Efficient Collaborative Application Monitoring Scheme for Mobile Networks 



New operating systems for mobile devices allow their users to download millions of applications 
created by various individual programmers, some of which may be malicious or flawed. In order to 
detect that an application is malicious, monitoring its operation in a real environment for a significant 
period of time is often required. Mobile devices have limited computation and power resources and 
thus are limited in their monitoring capabilities. In this paper we propose an efficient collaborative 
monitoring scheme that harnesses the collective resources of many mobile devices, "vaccinating" 
them against potentially unsafe applications. We suggest a new local information flooding algorithm 
called TTL Probabilistic Propagation (TPP). The algorithm, which can be implemented on any mobile device, 
periodically monitors one or more applications and reports its conclusions to a small number of other 
mobile devices, which then propagate this information onwards; each message carries a predefined 
TTL. The algorithm is analyzed and shown to outperform existing state-of-the-art information 
propagation algorithms, in terms of convergence time as well as network overhead. The maximal 
load of the algorithm (the fastest arrival rate of new suspicious applications, that can still guarantee 
complete monitoring), is analytically calculated and shown to be significantly superior compared to 
any non-collaborative approach. Finally, we show both analytically and experimentally using real 
world network data received among others from the Reality Mining Project, that implementing the 
proposed algorithm significantly reduces the number of infected mobile devices. In addition, we 
analytically prove that the algorithm is tolerant to several types of Byzantine attacks in which some 
adversarial agents may generate false information or abuse the algorithm in other ways. 

I. INTRODUCTION 

The market share of smartphones is rapidly increasing and is expected to increase even faster with the introduction 
of 4th generation mobile networks, growing from 350 million in 2009 to one billion by 2012 [9]. Companies that 
distribute new mobile device operating systems have created marketplaces that motivate individuals and other 
companies to introduce new applications (such as Apple's App Store, Google's Android Market, Nokia's Ovi Store 
and others). The main assumption behind these marketplaces is that users will prefer a mobile device based on 
an operating system with larger marketplace offerings. It is expected that in the future, various communities will 
develop additional alternative marketplaces that will not be regulated. These marketplaces will allow mobile users to 
download from a variety of millions of new applications. An example for such a marketplace is GetJar, offering 60,000 
downloadable applications for over 2,000 models of mobile devices, counting a total of over 900 million downloads by 
August 2010 [25]. The content of most marketplaces is currently not verified by their operators and thus some of the 
applications they contain may be malicious or contain faulty code segments. Downloading a malicious application from 
the marketplace is not the only way that a mobile device may be infected by malicious code. This may also happen 
as a result of malicious code that manages to exploit a vulnerability in the operating system or its applications, or 
through one of the mobile phone communication channels such as Bluetooth, Wi-Fi, etc. [26, 29]. McAfee's Mobile 
Security Report for 2008 states that nearly 14% of global mobile users have been directly infected or have known 
someone who was infected by a mobile virus (this number increased further in the following year) [35, 36]. In 
many cases, in order to detect that an application is malicious, monitoring its operation in a real environment for 
a significant period of time is required. The monitored data is then processed using advanced machine learning 
algorithms in order to assess the maliciousness of the application. For a variety of techniques for local monitoring of 
mobile phone applications see [28, 37-39]. 

In recent years most of the prominent security and privacy threats to communication networks have relied on the use 
of collaborative methods (e.g., botnets). The danger stemming from such threats is expected to increase significantly 
in the near future, as argued in [29, 49] and others. The amount of resources a single unit may allocate in order to 
defend against such threats without interfering with its routine work is very limited. Therefore, the development of an 
efficient collaborative defense infrastructure for mobile users is strongly required [51]. 

In this work we propose such a collaborative application monitoring infrastructure, that is capable of dramatically 
decreasing the susceptibility of mobile devices to malicious applications. Indeed, for every new malicious application 
that is introduced to the network, there will always be a few users that will encounter it for the first time, and 
subsequently may be exposed to its negative effects. However, by wisely assimilating this information throughout the 
network, the vast majority of the mobile devices will be notified of the properties of the new malicious application, 
based on the (poor) experience of those first users, well before encountering it. This behavior resembles the mammalian 
immune response to vaccination: a new threat has some (small) probability of damaging the organism (or network), but 
once it is defeated, all the cells of the organism are familiar with the threat and will be able to easily overcome it 
(or avoid installing it) should they encounter it in the future. As the portion of "new" applications (namely, 
applications that no network user has ever encountered) is rather small, the mobile devices are kept protected at all 
times against the vast majority of potentially harmful marketplace applications (this property of the proposed scheme 
is illustrated in Figure 1). 

This paper presents and analyzes an algorithm that efficiently implements this proposed scheme. The algorithm is 
shown to provide fast convergence and low network overhead, and to be capable of handling a significantly higher number of 
potentially malicious applications concurrently than any other collaborative or non-collaborative approach. In addition, 
the algorithm is fault tolerant to several known adversarial and Byzantine attacks. These features of the algorithm 
are shown both analytically and empirically, using simulations of various random networks as well as real-life 
networks based on MIT's Reality Mining project [17]. 




[Figure 1 panels: NO COLLABORATIVE MONITORING; COLLABORATIVE VACCINATION] 

FIG. 1: Each month mobile users download and install various kinds of mobile applications. These applications are comprised 
of "new applications" (namely, applications that were only introduced to the marketplace recently), and "old applications" 
(applications that were released to the marketplace in previous months). Whereas new applications may contain a few 
malicious applications, the vast majority of malicious applications have already been in the marketplace for some time. 
Using the proposed scheme, users of mobile devices become "vaccinated" against all of the potentially harmful applications 
among the old applications. As a result, the number of infected mobile devices decreases dramatically. 

In this paper we present a collaborative monitoring algorithm called TPP (TTL Probabilistic Propagation). The 
algorithm uses a time-to-live counter for each alerting message that is logarithmic in the number of mobile 
devices, n. Using these short-lived messages, sent between the network's devices, the algorithm is analytically 
shown to provide prior knowledge of every potentially malicious application to the vast majority of the network users. 
The fastest arrival rate of new applications for which this property can still be maintained (i.e., the maximal load 
of the algorithm) is analytically calculated (see Theorem 5) and shown to be far superior to that of any non-collaborative 
host-based monitoring approach (Corollary 1) or other collaborative approach. Furthermore, we show that the local cost of 
the algorithm (namely, the amount of local efforts required from each participating user) monotonically decreases as 
the network's size increases. 

To illustrate with real-world numbers: when implemented as a service executed by 1,000,000 mobile devices, assuming 
that each device can locally monitor only a single application per week, and that the requested vaccination efficiency 
is 99% (namely, we allow only 1% of the network's users to be susceptible to attacks), the system is guaranteed to 
collaboratively monitor 25,566 new applications each month (achieved using an average of 2.763 messages per device per day). A 
single non-collaborative device under the same requirements would be able to monitor only 5 new applications each 
month. A similar result was also demonstrated on a real world network, using the data collected in the Reality Mining 
Project [17] (see Section IX). 

In addition, the algorithm is shown to be partially fault tolerant to the presence of Byzantine devices that are 
capable of distributing messages containing false information, or of selectively refusing to forward some of the messages 
they receive [52] (see Theorem 6 in Section VI, discussing devices that distribute false messages, and Corollaries 3, 4 
and 5 in Section VII, concerning devices that block passing messages). 

Furthermore, we show that the cost of defending against adversarial abuse of the proposed algorithm grows 
asymptotically slowly, whereas the cost of increasing the strength of such attacks grows approximately linearly. This 
property of the algorithm makes it economically scalable and desirable to network operators (see more details in 
Section VI, and specifically the illustration that appears in Figure 4). 

The efficiency of the algorithm stems from the generation of an implicit collaboration between a group of random 
walking agents, released from different sources in the network (and at different times). This technique is novel, as 
most related works do not take into consideration that the agents might be released collaboratively from different 
sources [15, 16]. Hence, analysis of such systems was limited to the probabilistic analysis of the agents' movements. 

The rest of this work is organized as follows: related work and a comparison to state-of-the-art techniques are 
presented in Section II. The threat model is formally defined in Section III. The TPP collaborative algorithm is 
presented in Section IV, where its performance is also analyzed. The algorithm's robustness with regard to attacks by 
adversaries is discussed in Sections VI and VII. An alternative model, which does not require a random network overlay 
assumption, is discussed in Section VIII. Experimental results are presented in Section IX, and conclusions and 
suggestions for future research appear in Section X. In addition, this paper contains an appendix that discusses in 
detail several issues mentioned throughout the paper. 

II. RELATED WORK 

Flooding a network with messages intended for a large number of nodes is arguably the simplest form of information 
dissemination in communication networks (specifically when previous knowledge about the network topology is limited 
or unavailable). Since the problem of finding the minimum energy transmission scheme for broadcasting a set of 
messages in a given network is known to be NP-Complete [8], flooding optimization often relies on approximation 
algorithms. For example, in [43] messages are forwarded according to a set of predefined probabilistic rules, whereas 
[42] discusses a deterministic algorithm, approximating the connected dominating set of each node. 

In this work we apply a different approach: instead of probabilistic forwarding of messages, we assign a TTL 
value to each message, guiding the flooding process. The analysis of the algorithm is done by modeling the messages 
as agents performing a random walk in a random graph overlay of the network [53]. The optimal value of this TTL is 
derived, guaranteeing fast completion of the task while minimizing the overall number of messages sent. The use of 
a TTL-dependent algorithm was previously discussed in [1] (demonstrating $O(\log^2 n)$ propagation completion time) 
and [34] (where no analytic bound on the completion time was demonstrated). 

Flooding Algorithms. The simplest information propagation technique is of course the flooding algorithm. It is well 
known that the basic flooding algorithm, assuming a single source of information, guarantees propagation completion 
at a worst-case cost of $O(n^2)$ messages ($O(|E|)$ in expanders) and in time equal to the graph's diameter, which in 
the case of a random graph $G(n,p)$ equals $O\left(\frac{\log n}{\log (n \cdot p)}\right) \approx O(\log n)$ [12, 18]. 
We later show that the TPP algorithm discussed in this work achieves in many cases the same completion time, with a 
lower cost of $O(n \log n)$ messages (Theorems 3 and 4). 
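
As a rough illustration of the baseline described above, the following sketch floods a single message over a directed 
G(n,p) random graph and counts the rounds and messages used. The function names and the representation of the graph as 
adjacency lists are illustrative assumptions and are not taken from the paper. 

```python
import random


def random_digraph(n, p, rng):
    """Directed G(n, p): each ordered pair (u, v) with u != v is an edge with probability p."""
    return [[v for v in range(n) if v != u and rng.random() < p] for u in range(n)]


def flood(adj, source):
    """Basic flooding from one source.

    Every node forwards the message to all of its out-neighbours once, in the
    round after it first receives it. Returns (rounds, messages, informed).
    """
    informed = {source}
    frontier = [source]
    rounds = 0
    messages = 0
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                messages += 1              # one transmission per outgoing edge
                if v not in informed:
                    informed.add(v)
                    next_frontier.append(v)
        if next_frontier:
            rounds += 1                    # count only rounds that informed someone new
        frontier = next_frontier
    return rounds, messages, len(informed)
```

Since every informed node transmits once on each of its outgoing edges, the message count is bounded by the number of 
edges, and the number of rounds equals the largest hop distance from the source to a reachable node. 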

Other variants of the flooding algorithm use various methods to improve the efficiency of the basic algorithm. One 
example is probabilistic forwarding of the messages in order to reduce the overall cost of the propagation process, 
optionally combined with a mechanism for recognizing and reducing repeated transmissions of packets by the same 
device [41]. Other methods include area-based methods [41] and neighborhood-knowledge methods [44]. In many of 
these works, completion time is traded for a reduced overall cost, which results in a cost similar to that of the TPP 
algorithm proposed in this work (namely, $O(n \log n)$), but with a significantly higher completion time. Additional 
details on various flooding algorithms can be found in [50]. 

An extremely efficient flooding algorithm in terms of completion time is the network coded flooding algorithm, 
discussed in [13]. In this work, dedicated to G(n,p) random graphs, a message is forwarded by any receiving vertex 
times, while k is a parameter which depends on the network's topology [21]. Using this method, the algorithm 

achieves a completion time of approximately O(pjj). This algorithm, however, is still outperformed by the TPP 
algorithm. 

Epidemic Algorithms. An alternative approach to be mentioned in this scope is the use of epidemic algorithms [5, 
47]. There exists a variety of epidemic algorithms, starting with the basic epidemic protocol [14], through neighborhood 
epidemics [22], and up to hierarchical epidemics [46]. In general, all the epidemic variants exhibit a trade-off 
between the number of messages sent, the completion time, and the previous knowledge required by the protocols 
(concerning the structure of the network). However, the most efficient results of such approaches are still 
outperformed by the TPP algorithm. 

Distributed Coverage Algorithms. A different approach for the collaborative assimilation of an important piece 
of information throughout the network is the use of cooperative exploration algorithms (in either known or unknown 
environments), guaranteeing that all (or a large enough portion) of the graph is "explored" by agents carrying the 
alerting messages. Planar networks can be sampled into $\mathbb{R}^2$, and then be collaboratively explored by a 
decentralized group of myopic agents (see swarm coverage examples in [2, 31]). 

In [45] a swarm of ant-like robots is used for repeatedly covering an unknown area, using a real-time search method 
called node counting. Using this method, the agents are analytically shown to cover the network graph efficiently. 
Another algorithm to be mentioned in this scope is the LRTA* algorithm [32], which was shown in [30] to guarantee 
a cover time of $O(n^2)$ in degree-bounded undirected planar graphs. Interestingly, in such graphs, the random walk 
algorithm is also known to require at most $O(n^2)$ time (and at least $\Omega(n (\log n)^2)$) [27]. 

In [48] a collaborative algorithm is discussed that guarantees coverage of regions in the $\mathbb{Z}^2$ grid in 
$O(n^{1.5})$ time, using extremely simple and myopic "ant-like" mobile agents with no direct communication between 
the agents. In [3] this algorithm was later shown to guarantee (under some limitations) coverage of dynamically 
expanding domains in $O(n^{2.5} \log n)$ time. 

Summary. The maximal arrival rate of new applications for which successful decentralized monitoring can still be 
guaranteed (namely, the maximal load) of the TPP algorithm is analyzed in Section V. 

As previously discussed, the efficiency of the TPP algorithm is derived from the fact that participating devices form 
a collaborative infrastructure, using alerting messages that can be sent between any two users, which implicitly 
assumes the existence of an appropriate network overlay. The algorithm can still be implemented for any given 
network topology, assuming no network overlays. This variant of the algorithm is discussed in Section VIII. It should 
also be mentioned that as the TPP algorithm uses random elements, it requires approximately $O(\ln^2 n)$ random bits 
at each device for its proper execution. 

III. THE COLLABORATIVE APPLICATION MONITORING PROBLEM 

Given a mobile network of n devices, let us denote the network's devices by $V = \{v_1, v_2, \ldots, v_n\}$. Note that 
the network's topology may be dynamic [54]. Each device may occasionally visit the marketplace, having access to 
N new downloadable applications every month. We assume that applications are downloaded independently, 
namely, that the probability that a user downloads application $a_1$ and the probability that the same user downloads 
application $a_2$ are uncorrelated. Let us denote the average number of applications each user downloads each month 
by $\eta$, out of which $\eta_n$ are applications that were introduced to the marketplace this month ("new applications") 
and $\eta_o$ are applications that were introduced to the marketplace in previous months ("old applications"), such that 

$$\eta_o + \eta_n = \eta$$ 

For some malicious application $a_i$, let $p_{a_i}$ denote the application's penetration probability: the probability 
that, given some arbitrary device v, it is unaware of the maliciousness of $a_i$. The penetration probability of every new 
malicious application is 1. Our goal is to verify that at the end of the month, the penetration probability of every 
malicious application released during this month is lower than a penetration threshold $p_{MAX}$, resulting in a 
"vaccination" of the network with regard to these applications. This way, although a new malicious application may 
infect a handful of devices (the first ones it encounters), the rest of the network would quickly become immune to it. 
Formally, for some small $\epsilon$ we require that: 

$$\forall \text{ malicious application } a_i : \quad \mathrm{Prob}\left(p_{a_i} > p_{MAX}\right) < \epsilon$$ 

The rationale behind the use of the threshold $p_{MAX}$ is to increase the efficiency of the collaborative system, 
defending against dominant malicious applications. Given additional available resources, the parameter $p_{MAX}$ can be 
decreased, resulting in a tighter defense grid (traded for decreased load and increased message overhead). 

We assume that any device v can send a message of some short content to any other device u. In addition, we 
assume that at the initialization phase of the algorithm each device is given a list containing the addresses of some X 
random network members. This can be implemented either by the network operator, or by distributively constructing 
and maintaining a random network overlay. 

A revised model that does not require this assumption is analyzed in Section VIII. 

We assume that each device can locally monitor applications that are installed on it (as discussed for example in 
[4, 38]). However, this process is assumed to be expensive (in terms of the device's battery), and should therefore be 
executed as few times as possible. The result of an application monitoring process is a non-deterministic boolean 
value: {true, false}. 

False-positive and false-negative error rates are denoted as: 

$$P(\mathrm{Monitoring}(a_i) = \mathrm{true} \mid a_i \text{ is not malicious}) = E_+$$ 

$$P(\mathrm{Monitoring}(a_i) = \mathrm{false} \mid a_i \text{ is malicious}) = E_-$$ 

We assume that the monitoring algorithm is calibrated in such a way that $E_+ \approx 0$. 
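
To make the two error rates concrete, the following minimal sketch models one local monitoring run as a noisy boolean 
test. The function name `monitor` and its parameters are hypothetical stand-ins for a real monitoring mechanism; only 
the two conditional error probabilities come from the definitions above. 

```python
import random


def monitor(is_malicious, e_plus, e_minus, rng):
    """Return the noisy verdict (True = "malicious") of one monitoring run.

    e_plus  : probability of reporting a benign application as malicious.
    e_minus : probability of reporting a malicious application as benign.
    """
    if is_malicious:
        return rng.random() >= e_minus   # missed with probability e_minus
    return rng.random() < e_plus         # false alarm with probability e_plus
```

With e_plus set to 0, as assumed above, a benign application is never reported, so every alert that originates from 
honest monitoring refers to a truly malicious application. 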

As we rely on the propagation of information concerning the maliciousness of applications, our system might be 
abused by injection of inaccurate data. This may be the result of a deliberate attack, aimed at "framing" a benign 
application (either as a direct attack against a competing application, or as a more general attempt at undermining 
the system's reliability), or simply a consequence of a false-positive result of the monitoring function. Therefore, 
in order for a device v to classify an application $a_i$ as malicious, one of the following must hold: 
• Device v has monitored $a_i$ and found it to be malicious. 

• Device v has received at least p alerts concerning $a_i$ from different sources (for some decision threshold p). 
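
The two conditions above can be sketched as a small bookkeeping structure. The class name `AlertTally` and the use of 
an opaque `source` identifier to tell reporters apart are assumptions made for illustration; how sources are 
authenticated is outside the scope of this sketch. 

```python
from collections import defaultdict


class AlertTally:
    """Tracks which applications a single device currently classifies as malicious."""

    def __init__(self, threshold):
        self.threshold = threshold            # decision threshold p
        self.sources = defaultdict(set)       # app id -> distinct reporting sources
        self.malicious = set()

    def record_local_verdict(self, app, found_malicious):
        """Condition 1: the device monitored the application itself."""
        if found_malicious:
            self.malicious.add(app)

    def record_alert(self, app, source):
        """Condition 2: at least `threshold` alerts from different sources.

        Returns True if the application is classified as malicious afterwards.
        """
        self.sources[app].add(source)         # repeated alerts from one source count once
        if len(self.sources[app]) >= self.threshold:
            self.malicious.add(app)
        return app in self.malicious
```

Keeping a set of sources per application, rather than a plain counter, is what prevents a single adversarial device 
from reaching the threshold on its own by repeating the same alert. 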

In addition, note that the information passed between the devices concerns only applications discovered to be 
malicious. Namely, when an application is found to be benign, no message concerning this is generated. This is 
important not only for preserving a low message overhead but also for handling malicious applications that 
display normative behavior for a given period of time, after which they activate their malicious properties. 
In such cases, soon after an application exits its "dormant" stage, it will be detected and subsequently reported, 
generating a "vaccination reaction" throughout the network. 

IV. ALGORITHM, CORRECTNESS & ANALYSIS 

We shall now present the TPP algorithm. The algorithm is executed by each device separately and asynchronously, 
and requires neither supervised or hierarchical allocation of tasks nor shared memory. Following is a list 
of the main notations used in the presentation and analysis of the proposed algorithm. 

$n$              The number of devices participating in the TPP algorithm 
$\eta$           Average number of applications each user downloads each month 
$\eta_n$         Average number of "new applications" users download each month 
$\eta_o$         Average number of "old applications" users download each month 
$X$              The number of alert messages generated and sent upon the discovery of a malicious application 
$p_N$            The ratio $\frac{X}{n}$ 
$p_{MAX}$        Penetration threshold: the maximal percentage of non-vaccinated devices allowed by the network operator 
$E_-$            False-negative error rate of the local monitoring mechanism 
$T$              Delay time between each two consecutive local monitoring processes 
$N$              Number of new applications introduced to the network each month 
$p_{a_i}$        The penetration probability of application $a_i$ 
$1 - \epsilon$   Confidence level of the correctness of the convergence time estimation 
$\alpha$         The polynomial confidence level $\ln_n \epsilon^{-1}$ 
                 Applications arrival rate (number of new applications per time unit) 
Xa               Maximal applications arrival rate that a non-collaborative device can monitor 
Am               Monitoring rate (number of applications monitored per time unit) 
At               Number of algorithm's time steps per time unit 
timeout          The Time-To-Live counter assigned to alert messages 
$p$              Decision threshold: the number of alerts a device must receive in order to classify an application as malicious 
$C_S$            Cost of sending a single message 
$C_M$            Cost of locally monitoring a single application 






A. TTL Probabilistic Propagation (TPP) — a Collaborative Monitoring Algorithm 

The TPP algorithm conceptually relies on the fact that in order to "vaccinate" a network with regards to malicious 
applications, it is enough that only a small number of devices will monitor this application. This way, although the 
application monitoring process is relatively expensive (in terms of battery and CPU resources), the amortized cost 
of monitoring each malicious application is kept to a minimum. A detailed implementation of the TPP algorithm 
appears in Algorithm 1. 

At its initialization (lines 1 through 6), all the applications installed on the device are added to a list of suspected 
applications. In addition, an empty list of known malicious applications is created. Once an application is determined 
to be malicious, it is added to the known malicious applications list. In case this application was also in the suspected 
applications list (namely, it is installed on the device, but has not been monitored yet), it is deleted from that list. 
Once a new application is encountered it is compared to the known malicious applications list, and if found, an alert is 
sent to the user (alternatively, the application can be set to be uninstalled automatically). This feature resembles 
the long-term memory of the immune system in living organisms. If the new application is not yet known to be 
malicious, it is added to the suspected applications list. 

Once the algorithm is running, an arbitrary application is periodically selected from the list of suspected 
applications, once every T steps (lines 23 through 31). The selected application is then monitored for a given period 
of time, in order to discover whether it exhibits malicious properties (see details about such monitoring in 
Section III). If the application is found to be malicious, it is removed from the list of suspected applications and 
added to the known malicious applications list (lines 27 and 28). In addition, an appropriate alert is produced and 
later sent to X random devices. The alert message is also assigned a specific TTL value (lines 29 and 30). Once a 
network device receives such an alert message it automatically forwards it to one arbitrary device, while decreasing 
the value of TTL by 1. Once TTL reaches zero, the forwarding process of this message stops (lines 21 and 22). In case a 
monitored application displays no malicious properties, it is still kept in the list of suspected applications, for 
future arbitrary inspections. 

A device may also classify an application as malicious as a result of receiving an alert message concerning this 
application (lines 14 through 20). In order to protect benign applications from being "framed" (reported as 
malicious by adversaries abusing the vaccination system), a device classifies an application as malicious only after it 
receives at least p messages concerning it, from different sources (for a pre-defined decision threshold p). Note that 
when a device v receives an alert message concerning application $a_i$, it still forwards this message (assuming that 
TTL > 0), even when v has not yet classified $a_i$ as malicious (for example, when the number of alert messages received 
is still smaller than p). When the p-th alerting message concerning an application is received, the device adds the 
application to its known malicious applications list and removes it from the suspected applications list if needed. The 
selection of the optimal value of p is discussed in Section VI. The values of T, p and TTL, as well as the number of 
generated alert messages, can be determined by the network operators, or be changed from time to time according to 
the (known or estimated) values of n and N, and the penetration threshold $p_{MAX}$. Selecting an optimal value for 
TTL is discussed in Section IV. 
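
The per-device behavior described in this subsection can be sketched as follows. This is an illustrative sketch 
under stated assumptions rather than the authors' implementation: the class and method names are hypothetical, messages 
are modeled as tuples (target, application, origin, TTL), the originating device identifier is used to tell sources 
apart, and the local monitoring mechanism is passed in as a callable. 

```python
import random


class Device:
    """One TPP participant (hypothetical names; a sketch of Algorithm 1)."""

    def __init__(self, dev_id, installed, threshold, x, timeout, rng):
        self.id = dev_id
        self.installed = set(installed)
        self.suspected = set(installed)    # installed but not yet found malicious
        self.malicious = set()             # known malicious applications
        self.alert_sources = {}            # app -> set of distinct alert origins
        self.threshold = threshold         # decision threshold p
        self.x = x                         # alerts sent per discovery (X)
        self.timeout = timeout             # initial TTL of each alert
        self.rng = rng
        self.user_alerts = []              # stand-in for alerting the user

    def _classify_malicious(self, app):
        self.malicious.add(app)
        self.suspected.discard(app)
        if app in self.installed:
            self.user_alerts.append(app)

    def on_new_application(self, app):
        """A new application is encountered on this device."""
        if app in self.malicious:
            self.user_alerts.append(app)   # warn instead of trusting it
        else:
            self.installed.add(app)
            self.suspected.add(app)

    def on_alert(self, app, origin, ttl, peers):
        """Handle a received alert; return the forwarded message, or None."""
        sources = self.alert_sources.setdefault(app, set())
        sources.add(origin)
        if len(sources) >= self.threshold and app not in self.malicious:
            self._classify_malicious(app)
        ttl -= 1
        if ttl <= 0:
            return None                    # TTL exhausted: stop forwarding
        return (self.rng.choice(peers), app, origin, ttl)

    def periodic_monitor(self, is_malicious, peers):
        """Called once every T steps; returns the list of alerts to send."""
        if not self.suspected:
            return []
        app = self.rng.choice(sorted(self.suspected))
        if not is_malicious(app):
            return []                      # stays suspected for later inspections
        self._classify_malicious(app)
        targets = self.rng.sample(peers, min(self.x, len(peers)))
        return [(t, app, self.id, self.timeout) for t in targets]
```

Note that `on_alert` forwards the message whether or not the threshold has been reached, matching the remark above 
that a device keeps forwarding alerts even before it classifies the application as malicious. 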



B. Optimal Parameters for Guaranteeing Successful Monitoring 

Outline of Analysis. In order to analyze the algorithm's behavior, we shall model the movements of the notification 
messages between the network's devices as random walking agents, traveling in a random graph G(n,p). Taking into 
account the fact that the messages have limited lifespan (namely, TTL), a relation between the size of the graph 
and the lifespan of the agents is produced. Having the value of TTL that guarantees a coverage of the graph, the 
algorithm's completion time, as well as the overall number of messages sent, can then be calculated. 

While analyzing the performance of the TPP algorithm we imagine a directed Erdős-Rényi random graph 
$G(V,E) \sim G(n, p_N)$, where $p_N = \frac{X}{n}$. The graph's vertices V denote the network's devices, and the graph's 
edges E represent message-forwarding connections. Notice that as G is a random graph, it can be used for the analysis 
of the performance of the TPP algorithm, although the message-forwarding connections of the TPP algorithm are dynamic 
(notice again that an alternative model which does not require this randomness assumption is discussed in 
Section VIII). In addition, although the identity of the "neighbors" of a vertex v in the real network overlay may 
change from time to time (as the overlay graph can be dynamic), it can still be modeled by a static selection of X 
random neighbors of v. 

Observing some malicious application $a_i$, every once in a while some device on which $a_i$ is installed randomly 
selects it for monitoring. With probability $(1 - E_-)$ the device discovers that $a_i$ is malicious and issues alerts 
to X network members. We look at these reports as the system's "agents", and are interested in finding: 

• The time it takes the graph to be explored by the agents. Namely, the time after which every device was visited 
by at least p agents (and is now immune to $a_i$). 

• Total number of messages sent during this process. 

• The minimal TTL which guarantees a successful vaccination of the network. 

1: Initialization: 
2:     Let $A(v)$ be the list of installed applications 
3:     Let $A_S(v)$ be the list of suspected applications 
4:     Let $A_M(v)$ be the list containing known malicious applications 
5:     Initialize $A_S(v) \leftarrow A(v)$ 
6:     Initialize $A_M(v) \leftarrow \emptyset$ 
7: Interrupt upon encountering a new application $a_i$: 
8:     $A_S(v) \leftarrow A_S(v) \cup \{a_i\}$ 
9:     If $a_i \in A_M(v)$ then 
10:        $A_S(v) \leftarrow A_S(v) \setminus \{a_i\}$ 
11:        Issue an alert to the user concerning $a_i$ 
12:    End if 
13: Interrupt receives malicious application $a_i$ notice, for the j-th time: 
14:    If $j \ge p$ then 
15:        $A_S(v) \leftarrow A_S(v) \setminus \{a_i\}$ 
16:        $A_M(v) \leftarrow A_M(v) \cup \{a_i\}$ 
17:        If $a_i \in A(v)$ then 
18:            Issue an alert to the user concerning $a_i$ 
19:        End if 
20:    End if 
21:    Decrease TTL of report by 1 
22:    Forward report to a random network member 
23: Execute every T time-steps: 
24:    Select a random application $a_i$ from $A_S(v)$ for monitoring 
25:    If $a_i$ is found to be malicious then 
26:        Issue an alert to the user concerning $a_i$ 
27:        $A_M(v) \leftarrow A_M(v) \cup \{a_i\}$ 
28:        $A_S(v) \leftarrow A_S(v) \setminus \{a_i\}$ 
29:        Report to X random network members 
30:        Set TTL = timeout 
31:    End if 

Algorithm 1: TTL Probabilistic Propagation 

Note that the agents have a limited lifespan, equal to timeout. As the graph is a random graph, the location of the 
devices on which $a_i$ is installed is also random. Therefore, as they are the sources of the agents, we can assume that 
the initial locations of the agents are uniformly and randomly distributed over the vertices of V. In compliance with 
the instructions of the TPP algorithm, the movement of the agents is done according to the random walk algorithm. 

Application $a_i$ is installed on $n \cdot p_{a_i}$ devices, each of which monitors a new application every T time steps. 
Upon selecting a new application to monitor, the probability that such a device will select $a_i$ is $\frac{1}{N}$. The 
probability that upon monitoring $a_i$ the device will find it to be malicious is $(1 - E_-)$, in which case it will 
generate $n \cdot p_N$ alerting messages. The expected number of new agents created at time t, denoted as $k(t)$, 
therefore equals: 

$$k(t) = \frac{n^2 \cdot p_{a_i} \cdot p_N}{T \cdot N} (1 - E_-)$$ 

and the accumulated number of agents which have been generated in a period of t time-steps is therefore 
$k_t = \sum_{i \le t} k(i)$. 
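
The counting argument in this paragraph multiplies one factor per sentence, which the following sketch spells out. 
The function and parameter names are hypothetical. The per-device probability of selecting the application is passed 
in explicitly as `select_prob` rather than fixed, and the accumulated count assumes, as a simplification, that the 
generation rate stays constant over the period. 

```python
def expected_new_agents(n, p_ai, p_n, t_period, select_prob, e_minus):
    """Expected number of alerts ("agents") generated per time step for one application.

    n           : number of devices
    p_ai        : fraction of devices on which the application is installed
    p_n         : ratio X / n, so each detection emits n * p_n alerts
    t_period    : T, time steps between two monitoring runs on a device
    select_prob : probability that a monitoring run picks this application
    e_minus     : false-negative rate of the local monitor
    """
    hosting_devices = n * p_ai                       # devices that can detect it
    runs_per_step = hosting_devices / t_period       # each device runs once per T steps
    detections = runs_per_step * select_prob * (1.0 - e_minus)
    return detections * (n * p_n)                    # alerts emitted per detection


def accumulated_agents(t, **params):
    """Agents generated over t time steps, assuming a constant generation rate."""
    return t * expected_new_agents(**params)
```

For example, with 1,000 devices of which 10% host the application, X = 10 alerts per detection, T = 10, a selection 
probability of 0.5 and a perfect monitor, 10 devices run a monitoring pass per step, 5 of them detect the application, 
and 50 agents are created per step. 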

The value of timeout (the assigned TTL) is selected in such a way that the complete coverage of the graph, and 
therefore its vaccination against $a_i$, is guaranteed (with probability greater than $1 - \epsilon$). We now 
artificially divide the mission into two phases, the first containing the generation of agents and the second 
discussing the coverage of the graph. Note that this division ignores the activity of the agents created in the second 
phase. Note also that the fact that the agents are working at different times (and in fact, some agents may already 
have vanished while others have not even been generated yet) is immaterial. The purpose of this technique is to ease 
the analysis of the flow of the vaccination process. 

Denoting the completion time by $T_{Vac}$ we therefore have:

$$T_{Vac} = T_{Generation} + T_{Propagation}$$

It is easy to see that essentially $T_{Propagation} = timeout$. We now artificially set:

$$timeout = \lambda \cdot (T_{Generation} + timeout)$$

$$T_{Generation} = (1 - \lambda) \cdot (T_{Generation} + timeout)$$

From this we can see that:

$$T_{Generation} = \frac{(1 - \lambda)}{\lambda} \cdot timeout$$

We later demonstrate an upper bound for timeout, based on the number of agents created in $t \le T_{Generation}$
(ignoring the activity of the agents created between $t = T_{Generation}$ and $t = T_{Generation} + timeout$).

We now examine the case of $\lambda = 0.5$ (which we show to be the optimal selection for $\lambda$, in the paper's Appendix).
In this case, we can now write: $T_{Vac} \le 2 \cdot timeout$.

Let us denote the number of agents created in the first $T_{Generation}$ time-steps by $k = k_{T_{Generation}}$. We now find the
time it takes those k agents to completely cover the graph G, and from this, derive the value of timeout.

Let us denote by ρ-coverage of a graph the process the result of which is that every vertex in the graph was visited
by some agent at least ρ times.

Theorem 1. The time it takes k random walkers to complete a ρ-coverage of G in probability greater than $1 - \epsilon$
(denoted as T(n)) can be bounded as follows :

2(p-hi A ) 2(p-ln^) 
— — < T(n) < — —f±- 



1-e w-ttn) 1-e 2- 

Proof. See Appendix for complete proof. □
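
The notion of ρ-coverage can be explored with a small simulation. The sketch below is illustrative only and rests on two assumptions that are not taken verbatim from the text: each hop of a walker lands on a uniformly random vertex (a simplification of forwarding over the random overlay), and the upper bound of Theorem 1 is read as $\frac{2(\rho - \ln\frac{\epsilon}{n})}{1 - e^{-k/(2n)}}$, the form in which it is substituted when timeout is selected. The function names are hypothetical.

```python
import math
import random


def coverage_upper_bound(n, k, rho, eps):
    """Upper bound 2(rho - ln(eps/n)) / (1 - exp(-k/(2n))), as read from Theorem 1."""
    return 2.0 * (rho - math.log(eps / n)) / (1.0 - math.exp(-k / (2.0 * n)))


def simulate_rho_coverage(n, k, rho, rng, max_steps=100_000):
    """Steps until every vertex has been visited at least rho times by k walkers."""
    visits = [0] * n
    remaining = n                        # vertices still below rho visits
    for step in range(1, max_steps + 1):
        for _ in range(k):
            v = rng.randrange(n)         # assumption: every hop is a uniform jump
            visits[v] += 1
            if visits[v] == rho:
                remaining -= 1
        if remaining == 0:
            return step
    return None                          # coverage not reached within max_steps


if __name__ == "__main__":
    rng = random.Random(1)
    print(simulate_rho_coverage(200, 50, 1, rng), coverage_upper_bound(200, 50, 1, 0.01))
```

Under these assumptions the simulated coverage time for n = 200 and k = 50 falls well below the bound, which is expected since the bound has to hold with probability $1 - \epsilon$ rather than on average.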

We now show how to select a value of timeout that guarantees a successful vaccination process : 

Theorem 2. For every value of timeout that satisfies the following expression, the TPP algorithm is guaranteed to
achieve successful vaccination for any penetration threshold $p_{MAX}$, in probability greater than $1 - \epsilon$:

$$\frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{timeout \cdot \left(1 - e^{-timeout \cdot \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N}(1 - E_-)}\right)} = 1$$



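Proof. Recalling the expected number of agents generated at each time-step, the expected number of agents k that
appears in Theorem 1 equals:

$$k = \sum_{t \le T_{Generation}} \frac{n^2 \cdot p_{a_i} \cdot p_N}{T \cdot N}(1 - E_-)$$

A successful termination of TPP means that the penetration probability of (any) malicious application is decreased
below the threshold $p_{MAX}$. Until this is achieved, we can therefore assume that this probability never decreases below
$p_{MAX}$:

$$\forall t < T_{Generation} : \quad p_{a_i} \ge p_{MAX}$$

Therefore, we can lower bound the number of agents as follows:

$$k \ge timeout \cdot \frac{n^2 \cdot p_{MAX} \cdot p_N}{T \cdot N}(1 - E_-)$$

Assigning timeout = T(n) into Theorem 1, successful vaccination is guaranteed for:

$$timeout = \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - e^{-\frac{k}{2n}}} \le \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - e^{-timeout \cdot \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N}(1 - E_-)}}$$

and the rest is implied. □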
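
Because timeout appears on both sides of the condition in Theorem 2, a concrete value has to be found numerically. The sketch below is one possible way to do so, assuming the condition reads $timeout \cdot (1 - e^{-c \cdot timeout}) = 2(\rho - \ln\frac{\epsilon}{n})$ with $c = \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N}(1 - E_-)$. The function name `solve_timeout` and the parameter values in the usage example (in particular T = 100 time-steps between monitorings) are hypothetical.

```python
import math


def solve_timeout(n, N, T, p_max, p_N, rho, eps, E_minus=0.0):
    """Bisection for timeout * (1 - exp(-c * timeout)) = 2 * (rho - ln(eps / n))."""
    c = n * p_max * p_N * (1.0 - E_minus) / (2.0 * T * N)
    target = 2.0 * (rho - math.log(eps / n))

    def lhs(timeout):
        return timeout * (1.0 - math.exp(-c * timeout))

    lo, hi = 0.0, 1.0
    while lhs(hi) < target:          # lhs is increasing, so doubling brackets the root
        hi *= 2.0
    for _ in range(200):             # plain bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if lhs(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Hypothetical setting: 1000 devices, 100 applications, T = 100 time-steps.
    print(solve_timeout(n=1000, N=100, T=100, p_max=0.01, p_N=0.01, rho=3, eps=0.001))
```

Bisection is used because the left-hand side grows monotonically with timeout, so the root is unique and no derivative is needed.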
C. Number of Messages and Time Required for Vaccination

From the value of timeout stated in Theorem 2, the vaccination time $T_{Vac}$ as well as the overall cost of the TPP
algorithm can now be extracted. The cost of the algorithm is measured as a combination of the overall number of
messages sent during its execution and the total number of monitoring activities performed. Let us denote the cost
of sending a single message by $C_S$ and the cost of executing a single local monitoring process by $C_M$.

Observation 1. For any timeout = τ which satisfies Theorem 2, the time and cost of the TPP algorithm can be
expressed as:

$$T_{Vac} = O(\tau), \qquad M = O\left(k \cdot \tau \cdot C_S + \frac{k}{n \cdot p_N} \cdot C_M\right) = O\left(\frac{n^2 \cdot p_{MAX} \cdot p_N}{T \cdot N}(1 - E_-) \cdot \tau^2 \cdot C_S + \frac{k}{n \cdot p_N} \cdot C_M\right)$$



Theorem 3. The time it takes a network that implements the TPP algorithm to guarantee vaccination against all
the malicious applications from N available applications is upper bounded as follows:

$$T_{Vac} \le 4\sqrt{\frac{T \cdot N(\rho + (\alpha + 1)\ln n)}{n \cdot p_{MAX} \cdot p_N \cdot (1 - E_-)}} = O\left(\rho + \ln n + \frac{T \cdot N}{p_{MAX} \cdot (1 - E_-) \cdot \ln n}\right)$$

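Proof. Let us assume that ε is polynomial in $\frac{1}{n}$, namely: $\epsilon = n^{-\alpha}$ s.t. $\alpha \in \mathbb{Z}^+$.

Using the bound $(1 - x) < e^{-x}$ for $x < 1$ we can see that when assuming [55] that:

$$timeout \cdot \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N}(1 - E_-) < 1$$

Theorem 2 can be written as:

$$\rho + (\alpha + 1)\ln n \ge timeout^2 \cdot \frac{n \cdot p_{MAX} \cdot p_N \cdot (1 - E_-)}{4T \cdot N}$$

and therefore:

$$timeout \le \sqrt{\frac{4T \cdot N(\rho + (\alpha + 1)\ln n)}{n \cdot p_{MAX} \cdot p_N \cdot (1 - E_-)}}$$

Assigning this approximation of timeout into the assumption above, yields:

$$\sqrt{\frac{4T \cdot N(\rho + (\alpha + 1)\ln n)}{n \cdot p_{MAX} \cdot p_N \cdot (1 - E_-)}} < \frac{2T \cdot N}{n \cdot p_{MAX} \cdot p_N \cdot (1 - E_-)}$$

which is satisfied for the following values of $p_N$:

$$p_N < \frac{T \cdot N}{n \cdot p_{MAX} \cdot (\rho + (\alpha + 1)\ln n)(1 - E_-)}$$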

For constant (or smaller) values of p max and number of applications larger than O(lnn) we can safely assign p^ = 
As the vaccination time equals twice the value of timeout, the rest is implied. 

Assigning the upper bound for $p_N$ into Theorem 3 immediately yields $O(\rho + \ln n)$. When assigning $\frac{\ln n}{n}$ as a lower
bound for $p_N$ which guarantees connectivity [12], the following expression is received:

$$T_{Vac} \le 4\sqrt{\frac{T \cdot N \cdot (\rho + (\alpha + 1)\ln n)}{\ln n \cdot p_{MAX} \cdot (1 - E_-)}}$$

However, using the upper bound for $p_N$ derived above, we see that:

$$\frac{\ln n}{n} < \frac{T \cdot N}{n \cdot p_{MAX} \cdot (\rho + (\alpha + 1)\ln n)(1 - E_-)}$$

which in turn implies that:

$$\rho + (\alpha + 1)\ln n < \frac{T \cdot N}{p_{MAX} \cdot (1 - E_-) \cdot \ln n}$$

Combining the two yields:

$$T_{Vac} = O\left(\frac{T \cdot N}{p_{MAX} \cdot (1 - E_-) \cdot \ln n}\right)$$

Note that although $O\left(\frac{T \cdot N}{p_{MAX} \cdot (1 - E_-)}\right)$ is allegedly independent of n, assigning the connectivity lower bound
$p_N > \frac{\ln n}{n}$ into the upper bound for $p_N$ we can see that:

$$\frac{T \cdot N}{p_{MAX} \cdot (1 - E_-)} = \Omega(\rho \ln n + \ln^2 n)$$

□

Theorem 4. The overall cost (messages sent and monitoring) of a network that implements the TPP algorithm for
guaranteeing vaccination against all the malicious applications from N available applications is upper bounded as
follows:

$$M \le k \cdot timeout \cdot C_S + \frac{k}{p_N \cdot n} \cdot C_M \le 4n(\rho + (\alpha + 1)\ln n)C_S + 2C_M\sqrt{\frac{n(\rho + (\alpha + 1)\ln n) \cdot p_{MAX} \cdot (1 - E_-)}{p_N \cdot T \cdot N}}$$

$$= O\left((n\rho + n\ln n)C_S + \sqrt{\frac{n(\rho + \ln n) \cdot p_{MAX} \cdot (1 - E_-)}{p_N \cdot T \cdot N}}\, C_M\right)$$

Proof. When the vaccination process is completed, no new messages concerning the malicious application are sent.
The above is received by assigning the approximated value of timeout into Observation 1. □
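
The closed-form bounds of Theorems 3 and 4 are easy to evaluate for a given network. The helper below is a sketch under the reading used above, namely $timeout = \sqrt{4TN(\rho + (\alpha+1)\ln n) / (n \cdot p_{MAX} \cdot p_N (1 - E_-))}$, $T_{Vac} = 2 \cdot timeout$, a message term of $4n(\rho + (\alpha+1)\ln n)$ and a monitoring term of $2\sqrt{n(\rho + (\alpha+1)\ln n) p_{MAX}(1 - E_-)/(p_N T N)}$. The function name, the dictionary keys and the example parameter values are hypothetical.

```python
import math


def tpp_closed_form_bounds(n, N, T, p_max, p_N, rho, alpha,
                           E_minus=0.0, C_S=1.0, C_M=1.0):
    """Closed-form timeout, vaccination time and cost terms (Theorems 3 and 4)."""
    log_term = rho + (alpha + 1.0) * math.log(n)
    timeout = math.sqrt(4.0 * T * N * log_term / (n * p_max * p_N * (1.0 - E_minus)))
    messages = 4.0 * n * log_term                         # equals k * timeout
    detections = 2.0 * math.sqrt(                         # equals k / (n * p_N)
        n * log_term * p_max * (1.0 - E_minus) / (p_N * T * N))
    return {
        "timeout": timeout,
        "T_vac": 2.0 * timeout,
        "messages": messages,
        "detections": detections,
        "cost": messages * C_S + detections * C_M,
    }


if __name__ == "__main__":
    # Hypothetical setting: eps = n**-1, T = 100 time-steps between monitorings.
    print(tpp_closed_form_bounds(n=1000, N=100, T=100, p_max=0.01, p_N=0.01,
                                 rho=3, alpha=1))
```

Keeping the message and monitoring terms separate makes it possible to weight them with different $C_S$ and $C_M$ values without recomputing the bound.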

In networks of $E_- < 1 - o(1)$, provided that [56] $\rho = O(\ln n)$, and remembering that in this case
$\frac{T \cdot N}{p_{MAX} \cdot (1 - E_-)} = \Omega(\ln^2 n)$, Theorem 4 is dominated by:

$$M = O\left(n \ln n \cdot C_S + \frac{n}{\ln n} \cdot C_M\right)$$



V. MAXIMAL LOAD OF THE TPP ALGORITHM 

We shall now calculate the maximal load of the TPP algorithm, namely, the maximal rate at which new applications 
can be introduced into the system, while guaranteeing that each of them would still be monitored by enough devices. 

Definition 1. Let $\lambda_A$ denote the rate at which new applications are introduced to the system (units are #applications
per time unit).

Definition 2. Let $\lambda_M$ denote the rate at which each device monitors an arbitrary new application (units are
#applications per time unit).

Definition 3. Let $\lambda_T$ denote the rate at which alert messages are processed throughout the system (namely, how many
"time steps" pass per time unit).

We can now derive the maximal load of the system, as follows:

Theorem 5. An upper bound for the maximal applications load $\lambda_A$ is :

, n-pMAX -PN ■ (1 - E-) j 
A " 16(p+(a + l)Inn) ' T ' 

2/9 + 2(a + l)lnn 



= min { X T , 



1 - e 2 d-nrsld-Piv-i) J 

Proof. Interpreting $\lambda_T^{-1}$ as the number of months it takes to process a single step of the algorithm, we can now
assign $N = \lambda_A$ and $T = \lambda_M^{-1}$ into Theorem 3, requesting that the upper bound over $T_{Vac}$ shall be kept smaller than
1. $\hat{\lambda}_T$ denotes the maximal number of steps that may be required by the algorithm to complete the proliferation
of the alerting messages produced in the last monitoring of each month, calculated using the lower bound part of
Theorem 1.[57] □

Observation 2. For p = O(lnn) and constant values of ' pmax and then for p^ > the maximal load is 
monotonically increasing with n. 

We can now calculate the algorithm's benefit factor, comparing the system's maximal load to the maximal load of 
any non collaborative monitoring algorithm : 

Corollary 1. Let us denote by $\tilde{\lambda}_A$ the maximal load of a single device. The ratio between the maximal load of
applications a single device can non-collaboratively monitor and the maximal load of the TPP algorithm is:

$$\frac{\lambda_A}{\tilde{\lambda}_A} = \hat{\lambda}_T^2 \cdot (1 - p_{MAX}) \cdot p_{MAX} \cdot \frac{n \cdot p_N}{16(\rho + (\alpha + 1)\ln n)}$$

Proof. We can easily see that:

$$\tilde{\lambda}_A = (1 - E_-) \cdot \frac{1}{1 - p_{MAX}} \cdot \lambda_M$$

Dividing $\lambda_A$ by $\tilde{\lambda}_A$ the rest is implied. □

Observation 3. For p = O(lnra) and constant values of pmax and E_, then for p^ > and regardless of the 
monitoring rate Am the benefit factor of participating in the collaborative TPP monitoring algorithm is asymptotically 
greater than 1, and is monotonically increasing with n. 

Numeric Example I (Large Network) : assume a network of n = 1,000,000 devices that participate in the
collaborative effort. We shall require that $p_{MAX} = 0.01$ (namely, that 99% of the network must be fully vaccinated)
and that the reliability of the analysis should be 99.9% (as n = 1,000,000 it means that α = 0.5). Disregarding
adversarial attacks, we can use ρ = 1, and in addition, we shall assume that the error probability is very small. We
also assume that the time that is required for a message to be processed and forwarded is approximately 5 seconds
(namely, $\lambda_T = 720$, assuming time units of hours). Upon discovery of a malicious application, we shall assume that a
device sends this information to 400 network members (namely, $p_N = \frac{1}{2500}$). We shall also assume that each device
can monitor a single application per week (namely, $\lambda_M = \frac{1}{168}$). Using Theorem 5 we see that $\lambda_A \le 35.51163$, which
equals 25,566 new applications per month! [58]

As to the number of messages each participating device is required to send, using Theorem 4 we can see that each
device sends an average of at most $6 \cdot \ln n$ messages during each month, namely ≈ 2.763 messages on average per day.
Comparing this to the performance of a single device relying solely on non-collaborative host-based monitoring,
such a device could cope with monitoring fewer than 5 new applications each month (as $\tilde{\lambda}_A < 0.006$). This reflects a
benefit factor of more than 5,000 in the maximal load of suspicious applications.
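
The figures of Numeric Example I can be rechecked with a few lines of Python. The sketch assumes that the bound of Theorem 5 has the form $\lambda_A \le \frac{n \cdot p_{MAX} \cdot p_N (1 - E_-)}{16(\rho + (\alpha+1)\ln n)} \cdot \lambda_T^2 \cdot \lambda_M$, which is what substituting $N = \lambda_A$ and $T = \lambda_M^{-1}$ into Theorem 3 gives when the step rate is not capped; the function name `max_load` is hypothetical and $E_-$ is taken as 0.

```python
import math


def max_load(n, p_max, p_N, rho, alpha, lam_T, lam_M, E_minus=0.0):
    """n*p_max*p_N*(1-E_-) * lam_T**2 * lam_M / (16 * (rho + (alpha+1) * ln n))."""
    log_term = rho + (alpha + 1.0) * math.log(n)
    return n * p_max * p_N * (1.0 - E_minus) * lam_T ** 2 * lam_M / (16.0 * log_term)


n = 1_000_000
lam_A = max_load(n, p_max=0.01, p_N=400 / n, rho=1, alpha=0.5,
                 lam_T=720, lam_M=1 / 168)        # applications per hour
messages_per_day = 6 * math.log(n) / 30           # 6 ln n messages per 30-day month
single_device_per_month = (1 / 168) * 24 * 30     # one application per week

if __name__ == "__main__":
    print(lam_A, messages_per_day, single_device_per_month)
```

With these inputs the hourly load evaluates to approximately 35.51 and the per-device message rate to approximately 2.763 per day, in line with the values quoted in the example.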

Numeric Example II (Reality Mining) : In order to assess the performance of the algorithm for real-world
networks, we have used the data generated by the Reality Mining project [17]. This project analyzed 330,000 hours of
continuous behavioral data logged by the mobile phones of 94 subjects, forming a mobile based social network. Using
this network, we can estimate the efficiency of the TPP algorithm, as follows. Based on the Reality Mining data we
can calculate that n = 94, α = 1.52 (in order to guarantee a 99.9% confidence level) and $p_N = 0.092123$. In addition,
as in the previous example, we assume that each device can monitor a single application per week ($\lambda_M = \frac{1}{168}$),
that the time required for a message to be processed equals 5 seconds ($\lambda_T = 720$ and $\hat{\lambda}_T = 139$) and that no
adversaries are present (ρ = 1). Then the maximal load of the system $\lambda_A$ as a function of the penetration threshold
$p_{MAX}$ equals (using Theorem 5):

$$\lambda_A \le 4.997 \cdot p_{MAX}$$

The maximal load of the algorithm for this network appears in Figure 2.



Collaborative Vaccination for the Reality Mining Network

Maximal Load
* 94 mobile phones
* 360 applications per month (10% penetration threshold)
* 190 applications per month (5% penetration threshold)
* 72 applications per month (2% penetration threshold)

Network Overhead
* 45 messages per unit per month

Non Collaborative Monitoring
* 5 applications per month
* Knowledge gained through experience (hence, large infection probability)




FIG. 2: Maximal monthly load of the TPP algorithm implemented over the Reality Mining network. Note that the maximal 
load increases as the vaccination mechanism loosens (namely, guarantees vaccination for a slightly smaller portion of the 
network). For example, where for 98% vaccination the algorithm guarantees monitoring of 72 new applications per month, 
when we require that only 90% of the network's devices are vaccinated, the number of applications that can be monitored each 
month increases to 360. 



It is important to notice that although the number of applications the average user downloads each month is far
smaller than 5,000, the importance of the collaborative monitoring scheme is still high, as the main merit of the
proposed scheme lies in its predictive ability. Namely, whereas a user downloading 5 applications each month may
locally monitor them successfully, should one of them be malicious, the user will fall victim to its malicious behavior.
However, should users that participate in the proposed collaborative monitoring scheme try to download a malicious
application, they will be immediately notified about the maliciousness of the application (in high probability), before
they are exposed to its dangers. Indeed, the first users that encounter a new malicious application may fall victim to
its perils, an unavoidable property of any host-based monitoring system. However, using the proposed collaborative
scheme this is kept to a minimum.

VI. AVOIDING BENIGN APPLICATIONS "FRAME" THROUGH ADVERSARIAL USE OF THE TPP VACCINATION ALGORITHM

As mentioned in previous sections, the TPP algorithm is fault tolerant to the presence of adversarial devices which
try to compromise the system's integrity by causing a large enough portion of the network to consider (one or more)
benign applications as malicious. This aspect of the algorithm is crucial, as it is a collaborative algorithm which relies
on the participation of as many mobile devices as possible, devices that should be guaranteed that the efficient
collaborative monitoring will not be accompanied by erroneous results. In the TPP algorithm, this fault tolerance
is provided by the introduction of the ρ "decision threshold" into the decision system. Namely, an application is
marked as malicious only when at least ρ relevant messages (from different origins) are received. Relying on the
common agreement of several devices as a tool for overcoming system noise (which can be either coincidental or
intentional) is often used in swarm based systems. For example, a similar mechanism called "threshold cryptography"
is used for enabling the collaborative use of cryptographic tasks (see for example [24, 40]).

Definition 4. Let us denote by $P_{Attack}(TTL, \rho, \frac{k}{n}, \epsilon)$ the probability that a "framing attack" done by a group of k
organized adversaries will successfully convince at least $\epsilon \cdot n$ of the network's devices that some benign application $a_i$
is malicious.

A trivial example is the use of very large values for TTL, which allow a group of k adversaries to convince the
entire network that any given application is malicious, provided that k ≥ ρ, namely:

$$\forall k \ge 1,\ \forall \rho \le k,\ \forall \epsilon \le 1: \quad \lim_{TTL \to \infty} P_{Attack}\left(TTL, \rho, \frac{k}{n}, \epsilon\right) = 1$$

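Theorem 6. The probability that k attackers will be able to make at least an ε portion of the network's devices treat
some benign application $a_i$ as malicious, using the TPP algorithm with a decision threshold ρ is:

$$P_{Attack}\left(TTL, \rho, \frac{k}{n}, \epsilon\right) \le 1 - \Phi\left(\sqrt{n} \cdot \frac{\epsilon - \tilde{P}}{\sqrt{\tilde{P}(1 - \tilde{P})}}\right)$$

where:

$$\tilde{P} = e^{\left(\rho - TTL \cdot (1 - e^{-\frac{k \cdot p_N}{2}})\right)} \cdot \left(\frac{TTL \cdot (1 - e^{-\frac{k \cdot p_N}{2}})}{\rho}\right)^{\rho}$$

and where $\Phi(x)$ is the cumulative normal distribution function, defined as:

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}t^2}\, dt$$

and also provided that:

$$\rho > TTL\left(1 - e^{-\frac{k \cdot p_N}{2}}\right)$$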

Proof. We use Lemma 2 to calculate the probability that a device $v \in V$ will be reported of some malicious application
$a_i$ by a message sent by one of the k adversaries at the next time-step. This is yet again a Bernoulli trial with:

$$p_s = 1 - e^{-\frac{k \cdot n \cdot p_N}{2n}} = 1 - e^{-\frac{k \cdot p_N}{2}}$$

Denoting as $X_v(t)$ the number of times a notification message had arrived to v after t steps, using Chernoff bound:

$$P[X_v(t) > (1 + \delta)\, t \cdot p_s] < \left(\frac{e^{\delta}}{(1 + \delta)^{(1 + \delta)}}\right)^{t \cdot p_s}$$

in which we set $\delta = \frac{\rho}{TTL \cdot p_s} - 1$. We can therefore see that:

$$\tilde{P} \triangleq P_{Attack}\left(TTL, \rho, \frac{k}{n}, n^{-1}\right) = P[X_v(TTL) \ge \rho] \le e^{(\rho - TTL \cdot p_s)} \cdot \left(\frac{TTL \cdot p_s}{\rho}\right)^{\rho}$$

It is important to note that the Chernoff bound requires that δ > 0. This is immediately translated to the
following requirement:

$$\rho > TTL\left(1 - e^{-\frac{k \cdot p_N}{2}}\right)$$

As we want to bound the probability that at least $\epsilon n$ of the devices are deceived, we shall use the above as a success
probability of a second Bernoulli trial. As n is large, the number of deceived devices can be approximated using
normal distribution:

$$P_{Attack}\left(TTL, \rho, \frac{k}{n}, \epsilon\right) \le 1 - \Phi\left(\frac{\epsilon \cdot n - n \cdot \tilde{P}}{\sqrt{n \cdot \tilde{P}(1 - \tilde{P})}}\right)$$

and the rest is implied. □
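
The bound of Theorem 6 can be evaluated directly. The sketch below follows the two steps of the proof (a Chernoff bound for a single device, then a normal approximation over the n devices); the function names are hypothetical, and the usage example takes the parameter values listed for Figure 3, reading the edge probability given there as $p_N$.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def framing_attack_bound(n, p_N, ttl, rho, k, eps):
    """Upper bound on the success probability of k framing adversaries."""
    p_s = 1.0 - math.exp(-k * p_N / 2.0)       # per-step hit probability
    mean = ttl * p_s
    if rho <= mean:
        raise ValueError("the Chernoff step needs rho > TTL * (1 - exp(-k * p_N / 2))")
    # Chernoff bound on a single device receiving at least rho false notices
    p_single = math.exp(rho - mean) * (mean / rho) ** rho
    # Normal approximation for at least eps * n deceived devices
    z = math.sqrt(n) * (eps - p_single) / math.sqrt(p_single * (1.0 - p_single))
    return 1.0 - normal_cdf(z)


if __name__ == "__main__":
    for k in (480, 494):
        print(k, framing_attack_bound(n=1_000_000, p_N=0.00005, ttl=1600,
                                      rho=41, k=k, eps=0.0001))
```

Evaluated at 480 and at 494 adversaries, the bound moves from a value close to 0 to a value close to 1, which is the phase transition discussed for Figure 3.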

Theorem 6 is illustrated in Figures 3 and 4. Figure 3 presents the execution of the vaccination algorithm in a
network of 1,000,000 devices, where each device is connected to 50 neighbors. In this example we require that the
number of devices that may be deceived by adversarial attackers would be at most 100. With the decision threshold
ρ properly calibrated according to Theorem 6, we show that as long as the number of adversaries is below 480 they
cannot launch a successful attack. However, as the number of adversaries increases, such an attack quickly becomes
a feasible option. Specifically, increasing the number of adversaries by 3% (from 480 to 494) raises the probability of
a successful attack to 1. In order to compensate for the increase in the number of adversaries, devices operating the
TPP algorithm simply have to increase the value of ρ by 1. Figure 4 shows how each small increase in the value of the
decision threshold ρ requires the adversaries to increase their numbers by approximately 7%. Interestingly, the effect
of such changes in the value of ρ on the algorithm's load and overhead is very small, as can be observed in Figure 5.

Note that in the proof of Theorem 6 we assumed that the adversarial devices may decide to send a false message
concerning an application's maliciousness, but they must do so using the definitions of the TPP algorithm. Namely,
each device may send at most $p_N \cdot n$ messages, and send these messages to random destinations. If adversarial
devices had been allowed flexibility in those constraints as well, a small group of adversarial devices could have sent
an infinitely large number of false messages, that would have been propagated throughout the network, resulting in
a successful attack (similarly to the case where TTL → ∞). Alternatively, had adversarial devices been allowed to
choose the destination of the $p_N \cdot n$ messages they send, they could have all sent them to the same destination v, thus
guaranteeing that v would be deceived. More generally, a group of $k = i \cdot \rho$ adversarial devices could have sent
their messages to $i \cdot p_N \cdot n$ different devices, guaranteeing their deception.

Definition 5. Let us denote by $P_{Attack-Destination}(TTL, \rho, \frac{k}{n}, \epsilon)$ the attack success probability when adversarial
devices are allowed to control the destination of the messages they produce.

The following Corollary can be drawn:

Corollary 2. The probability that k attackers that can control the destination of the $p_N \cdot n$ messages they produce
will be able to make at least an ε portion of the network's devices treat some benign application $a_j$ as malicious, using
the TPP algorithm with a decision threshold ρ is:

$$P_{Attack-Destination}\left(TTL, \rho, \frac{k}{n}, \epsilon\right) \le P_{Attack}\left(TTL - 1, \rho, \frac{k}{n}, \epsilon - \frac{k}{\rho} \cdot p_N\right)$$



VII. FAULT TOLERANCE TO "LEECHING" AND "MUTING ATTACKS" 

Collaborative systems, by their nature, are based on the fact that the participants are expected to allocate some
of their resources (computational, energy, storage, etc.) to be used by some collaborative algorithm. However, what
happens when users decide to benefit from the system's advantages without providing the contribution that is expected
from them? This behavior, known as leeching, is a known artifact in many Peer-to-Peer data distribution systems,
in which users often utilize the system for data download, without allocating enough upload bandwidth in return. A
similar behavior can also be the result of a deliberate attack on the system: a muting attack. In this attack, one or
more participants of the system block all the messages that are sent through them (namely, automatically decrease
the TTL of those messages to zero). In addition, no original messages are generated by these participants. The
purpose of this attack is to compromise the correctness of the TPP vaccination algorithm, which relies on the paths
messages of a given TTL value are expected to perform.

[Figure 3 plot. Parameters: #Units n = 1,000,000; #Applications N = 500; edge probability = 0.00005; messages TTL = 1,600;
decision threshold ρ = 41; penetration threshold $p_{MAX}$ = 0.001; deception threshold ε = 0.0001; confidence level 99.9%.
Horizontal axis: number of adversarial devices.]

FIG. 3: An illustration of Theorem 6: an upper bound for the success probability of a collaborative "framing attack" as
a function of the number of adversarial devices. In this example, a changing number of adversaries are required to deceive
at least 100 devices to think that some benign application is in fact malicious. Notice the phase transition point around 485
adversarial devices.

FIG. 4: An illustration of Theorem 6: the influence of the value of decision threshold ρ on the number of adversarial devices
required for a successful attack. Note how as the adversaries increase their numbers, all that has to be done in order to prevent
them from launching successful attacks is simply to slightly increase the value of the decision threshold ρ. This makes the TPP
algorithm economically favorable for network operators. From experimental observations it can be seen that increasing the
value of ρ results only in a slight increase in the vaccination time and network overhead, which converge asymptotically, whereas
the resources required in order to launch stronger attacks grow approximately linearly. See more details in Figure 5.



Either performed as a way of evading the need to allocate resources for the collective use of the network, or 
maliciously as an adversarial abuse of the system, this behavior might have a significant negative influence on the 
performance of the TPP algorithm. The operators of the network therefore need to have a way of calibrating the 
system so that it will overcome disturbances caused by any given number of participants who select to adopt this 
behavior. In this section we show that in terms of completion time the TPP algorithm is fault tolerant to the presence 
of blocking devices up to a certain limit. 

Definition 6. Let $p_{mute}$ denote the probability that a given arbitrary network device stops generating vaccination
messages or starts blocking some or all of the messages that are sent through it.

We now show (Corollary 3) that the expected vaccination time remains unchanged as long as : p mu te ^ p \nn • ^ or 
higher values of p m ute Theorem 7 presents an analytic upper bound for the algorithm's expected completion time. 
For extremely high values of p mute the completion time of the algorithm is shown in Corollary 4 to be upper bounded 

by i 4 ?™*' • Ty ac (denoting by Ty ac the vaccination time with the presence of no blocking devices). 

As to the overall cost of the TPP algorithm, it is shown in Corollary 5 to remain completely unaffected by the 
presence of blocking devices, regardless of their number. Due to space considerations, some of the proofs were omitted 
and can be found in the Appendix. 

Definition 7. Let us denote by $T(n, p_{mute})$ the vaccination time of a network of n devices, with a probability of $p_{mute}$
to block messages.

Theorem 7. The vaccination completion time of the TPP algorithm for some critical penetration $p_{MAX}$ in probability
greater than $1 - \epsilon$, while at most $n \cdot p_{mute}$ devices may block message forwarding and generation, is :

T(n, Pmute) = — 



^ g Pmute 2T-N-(1 — E_)~ 1 

while for the calculation of timeout we can use the expressions that appear in Theorem 2 or Theorem 3. 

Proof. See Appendix for complete proof. □ 

We shall now observe the behavior of the expression above for various values of the ratio $r_{mute}$, defined as
$p_{mute} \cdot timeout$. We shall note three complementary cases: $r_{mute} \ll 1$, $r_{mute} \gg 1$ or $r_{mute} \approx 1$.

It is easy to see that when $r_{mute} \ll 1$, the decay of the number of messages is negligible, namely :

i _ _ -timeout-p mut e 

- 1 - Pmute c . . . 

rts timeout 

Pmute 

As a result, Theorem 7 can be approximated by Theorem 2. Subsequently, the Theorems and Corollaries that are 
derived from Theorem 2 would hold, according to which we can see that : p mu te *C o(piogn) ■ 
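
The role of $r_{mute}$ can be illustrated with a short calculation. The sketch below is not taken from the paper's proof; it assumes a simplified model in which every device reached by a message blocks it independently with probability $p_{mute}$, so that a message with TTL = timeout makes hop i only if none of the previous i - 1 receivers blocked it. The function names are hypothetical.

```python
import math


def expected_hops(ttl, p_mute):
    """Expected hops of one message: sum over i = 1..ttl of (1 - p_mute)**(i - 1)."""
    if p_mute == 0.0:
        return float(ttl)
    return (1.0 - (1.0 - p_mute) ** ttl) / p_mute


def expected_hops_approx(ttl, p_mute):
    """Exponential approximation (1 - exp(-ttl * p_mute)) / p_mute."""
    return (1.0 - math.exp(-ttl * p_mute)) / p_mute


if __name__ == "__main__":
    for p_mute in (1e-6, 1e-3, 1e-1):
        print(p_mute, expected_hops(1000, p_mute), expected_hops_approx(1000, p_mute))
```

In this model a message travels close to its full TTL when $p_{mute} \cdot timeout \ll 1$, while for $p_{mute} \cdot timeout \gg 1$ its expected path length drops to about $1 / p_{mute}$ regardless of the TTL.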

Based on the above, we can now state the fault tolerance of the TPP algorithm, with respect to the muting attack : 

Corollary 3. The TPP algorithm is fault tolerant with respect to the presence mute mobile devices 

(for some c«lj. Namely : 

T{n lP mute)~T{n,Q) 

Let us now observe the case where $r_{mute} \gg 1$. In this case, most of the messages are likely to be blocked before
completing their TTL-long path. This results in an increased vaccination time, as shown in the following Corollary:

Corollary 4. When the number of blocking devices is greater than c ■ O (^jj^ij (for some c » 1), the completion 
time of the TPP algorithm is affected as follows : 

T(n,p mute )< ^ mute .T(n,0) 2 

Pmute 

Proof. See Appendix for complete proof. □

Note that for every other value of $r_{mute}$ which revolves around 1, the expected vaccination time would move
between T(n, 0) and ^""^ • T(n, 0) 2 (see more details concerning the monotonic nature of T(n,p mute ) in the proof 
of Corollary 5). 

Let us now examine the effect blocking devices may have on the number of messages sent throughout the execution
of the algorithm. As shown in the following Corollary, the overall cost of the TPP algorithm remains unaffected by
the presence of any given number of blocking devices.

Definition 8. Let us denote by $M(n, p_{mute})$ the overall cost of the TPP algorithm (messages sent + monitoring) for
a network of n devices, with a probability of $p_{mute}$ to block messages.

Corollary 5. The overall cost of the TPP algorithm is unaffected by the presence of blocking devices. Namely:

$$\forall p_{mute} < 1: \quad M(n, p_{mute}) = M(n, 0)$$

Proof. See Appendix for complete proof. □



VIII. VACCINATION IN GENERAL GRAPHS USING NO OVERLAYS 

An interesting question is raised when the forwarding of notification messages between the vertices is not assumed 
to be done using a random network overlay, but rather — using only the edges of the network graph. This is also 
motivated by works such as [11] and [23] which show how network topology may play an important role in the 
spreading of virus / malware (and subsequently, also of the vaccination messages). 

In addition, we shall assume that we have no knowledge concerning the structure of the network graph, or its type. 
We now show that even in this case, the TPP algorithm can still be used, albeit with a much larger completion time. 

Theorem 8. Using the TPP algorithm in an unknown graph with no network overlay, to guarantee a successful
vaccination process for some critical penetration $p_{MAX}$, the following completion time is obtained:

$$T_{Vac} = O\left(\left(\frac{T \cdot N}{p_{MAX} \cdot (1 - E_-)}\right)^{\frac{2}{3}} \cdot n^{-\frac{2}{3}} \cdot \log^{\frac{4}{3}} n\right)$$

Proof. We shall first observe the following upper bound concerning the exploration time of a general graph using a
decentralized group of k random walkers [7]:

$$E(ex_G) = O\left(\frac{|E|^2 \log^3 n}{k^2}\right)$$

Note that the coverage time of random walkers in graphs is also upper bounded by $\frac{4}{27}n^3 + o(n^3)$ [20]. However, as
we assume that $p_N < O(n^{-(0.5 + \epsilon)})$ (for some ε > 0), using the bound of [7] gives a tighter bound. Using our lower
bound over the number of agents for k, and as $|E| = n \cdot p_N$, we get:

$$E(ex_G) = O\left(\frac{n^2 \cdot p_N^2 \cdot \log^3 n \cdot T^2 \cdot N^2}{timeout^2 \cdot n^4 \cdot p_{MAX}^2 \cdot p_N^2 \cdot (1 - E_-)^2}\right)$$

Multiplying the exploration time by ρ (in order to guarantee that each device will get at least ρ messages) and
replacing $E(ex_G)$ with timeout we see that:

$$T_{Vac} = O\left(\rho^{\frac{1}{3}} \cdot \left(\frac{T \cdot N}{p_{MAX} \cdot (1 - E_-)}\right)^{\frac{2}{3}} \cdot n^{-\frac{2}{3}} \cdot \log n\right)$$

Assuming also that ρ = O(ln n), the rest is implied. □

The overall cost of the algorithm in graphs with no overlays is given in Theorem 9.

Theorem 9. Using the TPP algorithm in an unknown graph with no network overlay, to guarantee a successful
vaccination for some penetration threshold $p_{MAX}$, the overall cost of the algorithm is upper bounded by the following
expression :



O 



T-N 



Pmax • (1 - EJ) 



■ n 3 • log 3 n ■ Cs + 



• log 3 n 



r • Cm 

I PI f T-N V 

' \pmax-(1-E-) J ) 



Proof. As to the overall cost of the algorithm, we know that:

$$M = O\left(k \cdot timeout \cdot C_S + \frac{k}{n \cdot p_N} \cdot C_M\right)$$

Using the expressions for timeout, k and $p_N$, the rest is implied. □



IX. EXPERIMENTAL RESULTS 



We have implemented the TPP algorithm in a simulated environment and tested its performance in various
scenarios. Due to space considerations, we present in this Section only a sample of the results we have obtained. We
have tested a network of n = 1000 devices, having access to N = 100 applications, one of which was malicious [59].
We assume that each device downloads 30 random applications, monitors 1 application every week, and can send
notification messages to 10 random network members (namely, $p_N = 0.01$). We require that upon completion, at
least 990 network members must be aware of the malicious application (namely, $p_{MAX} = 0.01$), with ε = 0.001. In
addition, we assumed that among the network members there are 100 adversaries, trying to deceive at least 50 mobile
devices to believe that some benign application is malicious.

Figure 5 shows the time (in days) and messages required in order to complete this monitoring, as a function of
the decision threshold ρ. We can see that whereas the adversaries succeed in probability 1 for ρ < 3, they fail in
probability 1 for any ρ ≥ 3. Note the extremely efficient performance of the algorithm, with completion time of ~ 260
days using only 5 messages and at most 30 monitored applications per user. The same scenario would have resulted
in 100 messages per user using the conventional flooding algorithm, or alternatively, in 700 days and 100 monitored
applications per user using a non-collaborative scheme. Notice how the increase in the time and network overhead as
a result of an increase in ρ (in order to compensate for an increase in the number of adversaries) converges asymptotically.
This makes the algorithm economically scalable, as discussed in Section VI.

Figure 6 demonstrates the decrease in completion time and network overhead as a result of increasing the
penetration threshold $p_{MAX}$. Figure 7 demonstrates the evolution in the malicious application's penetration probability
throughout the vaccination process. An interesting phenomenon is demonstrated in Figure 8, where the number of
adversarial devices is gradually increased, and their success in deceiving 5% of the network's members is studied. As
the deception rate increases linearly, the probability of a successful attack displays a phase transition, growing
rapidly from "a very low attack success probability" to "very high attack success probability" with an increase of
only 20% in the number of adversaries.

As stated in previous sections, the main contribution that stems from implementing the TPP algorithm is the 
fact that the vast majority of the mobile devices become immune to any new malicious applications that may be 
introduced to the network. We have implemented a network of 70,000 units based on the aggregated calls-graph that 
was received from an actual mobile network. Our simulation assumed that each user downloads an average of 50 
applications per month, out of which r\ n — 5 new applications. To this model we have periodically injected a small 
number of malicious applications (approximately 0.5% of the applications). In this simulation we used the well known 
SIR epidemic model, with a slight variation — as the network is dynamic, we assumed that there is a constant flow 
of users leaving the network, which is compensated by a flow of users joining the network. This allows the network 
to avoid the "everyone dies eventually" syndrome, by constantly trading a portion of the recovered devices for new 
susceptible ones. Notice that a device may become infected either as a result of downloading a malicious application
or as a result of exposure to an infected device. In both cases, a mandatory condition for the infection is that the
device was not already vaccinated against this application. Figure 9 presents the dynamics of the network, comparing
the number of infected and damaged devices when the collaborative vaccination mechanism is present to the number 
of infected and damaged mobile devices without it. Notice how the proposed algorithm decreases the number of 
infectious devices at the steady state of the network, as well as the accumulated number of malicious incidents, by 
approximately 75%. 
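
The SIR variant with churn described above can be sketched as a discrete-time compartment model. This is only an illustrative sketch and not the simulation behind Figure 9: the function name `sir_with_churn` and the rates `beta`, `gamma` and `churn` are hypothetical placeholders, infection is assumed to spread by uniform mixing rather than over the calls-graph, and the vaccination mechanism itself is not modeled.

```python
def sir_with_churn(s, i, r, beta, gamma, churn, steps):
    """Discrete-time SIR in which a fraction `churn` of every compartment
    leaves at each step and is replaced by new susceptible devices.
    All parameter values are illustrative assumptions."""
    n = s + i + r
    history = [(s, i, r)]
    for _ in range(steps):
        new_inf = beta * s * i / n      # susceptible -> infected
        new_rec = gamma * i             # infected -> recovered
        s, i, r = (
            s - new_inf - churn * s + churn * n,  # joiners arrive susceptible
            i + new_inf - new_rec - churn * i,
            r + new_rec - churn * r,
        )
        history.append((s, i, r))
    return history


trajectory = sir_with_churn(s=69_000.0, i=1_000.0, r=0.0,
                            beta=0.3, gamma=0.1, churn=0.01, steps=200)
```

Because departing devices are replaced by susceptible ones, the total population stays constant and the susceptible compartment is never exhausted, which is the property the text relies on to avoid the "everyone dies eventually" behavior.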
FIG. 5: An experimental result of a network of n = 1000 members, with N = 100 applications, $p_{MAX} = 0.01$, $p_N = 0.01$ and
100 adversaries that try to mislead at least 5% of the network into believing that some benign application is malicious. Notice
how changes in $\rho$ dramatically affect the adversaries' success probability, with almost no effect on the completion time.




FIG. 6: The effect of decreasing the penetration threshold $p_{MAX}$ on the algorithm's completion time and number of messages
($\rho = 1$).

X. CONCLUSIONS AND FUTURE WORK 

In this work we have presented the TPP TTL-based propagation algorithm, capable of guaranteeing the collaborative
vaccination of mobile network users against malicious applications. The performance of the algorithm was
analyzed and shown to be superior to the state of the art in this field, guaranteeing that a higher number of suspicious
applications can be monitored concurrently, while using a lower network overhead.

The algorithm was also shown to be capable of overcoming the presence of adversarial devices which try to inject
messages of false information into the network. The challenge of filtering out false information which is injected into a
collaborative information propagation network resembles the known "faulty processors" problem. For example, the
following work discusses the challenge of synchronizing the clock of a communication network of size n when ^ faulty 
processors are present [6]. Another interesting work in this scope is the work of [10] which discusses a collaborative 
fault tolerant "acknowledgement propagation" algorithm, for reporting on the receipt (or lack of) of sent messages. 

It should be noted that during the analysis of the fault tolerance of the algorithm, we assumed that although an 
attacker can send reports of benign applications, or alternatively — refuse to forward messages passed through it, 
the basic parameters of the algorithm are still preserved. Namely, adversarial devices are not allowed to send more 
than X messages or generate messages with TTL values higher than the value allowed by the network operator. In 
addition, we assumed that while an adversary can "kill" messages sent to it by refusing to forward them, it cannot 
however interfere with the content of the message itself (e.g. by changing the identity of the message's original sender, 
its current TTL, or the identity of the application it reports on). The exact implementation details of a cryptographic
protocol with these properties, however, are outside the scope of this work.

Future versions of the work should investigate the generalization of the propagation mechanism, allowing the number
of alert messages generated to be dynamically calibrated in order to further decrease the algorithm's overhead (for
example, as a function of the number of alerts already received concerning the application). Alternatively, upon
forwarding a message, devices might be requested to send more than a single copy of the message they received.

FIG. 7: The penetration probability of the malicious application, as a function of the time, with $\rho = 1$ (on the left) and $\rho = 20$
(on the right).

FIG. 8: An illustration of the phase transition that is displayed when observing the influence of the number of adversaries over
their probability to successfully generate an attack. Note how an increase of 20% in the number of adversaries increases the
probability to deceive a large enough portion of the network from less than 0.2 to approximately 0.8.

FIG. 9 (panels: "With No Collaborative Monitoring" and "With TPP Vaccination Mechanism"): An experimental demonstration
of the benefits of using the proposed collaborative monitoring algorithm. Implemented
on a network that is based on a real-life mobile network of 70,000 devices, the TPP algorithm significantly decreases both the
number of infectious devices and the accumulated number of malicious incidents in the network (i.e. damages caused by
activations of malicious applications that were installed on mobile devices).

Another interesting topic to investigate is the use of the mechanism proposed in this work as an infrastructure for
other security related problems. One example is the problem of collaboratively coping with malicious beacons
in a hostile wireless environment, as discussed in [33]. Most of the existing localization protocols for sensor networks
are vulnerable in hostile environments, which calls for enhancing the security of location discovery. The work
of [33] presents voting-based methods to tolerate malicious attacks against range-based location discovery in sensor
networks. This problem seems to benefit from the use of a mechanism which is fault tolerant to the injection of false
information, such as the algorithm we propose in this work.

Appendix A: Proof of Theorem 1 

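THEOREM 1 The time it takes k random walkers to complete a $\rho$-coverage of G in probability greater than $1 - \epsilon$
(denoted as T(n)) can be bounded as follows:

$$\frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - e^{-\frac{3k}{2n\left(1 - \frac{1}{\ln n}\right)}}} \le T(n) \le \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - e^{-\frac{k}{2n}}}$$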
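
As a quick numerical illustration of Theorem 1, the sketch below evaluates both sides of the bound. It assumes the bound has the form $2(\rho - \ln(\epsilon/n)) / p$, with $p$ ranging between the worst-case and best-case per-step notification probabilities of Lemma 2; the function name `coverage_time_bounds` and the value k = 500 are illustrative assumptions and do not come from the paper.

```python
import math


def coverage_time_bounds(n, k, rho, eps):
    """Return (lower, upper) bounds on T(n), assuming the form
    2 * (rho - ln(eps / n)) / p_success from Theorem 1."""
    numerator = 2.0 * (rho - math.log(eps / n))
    # Worst case: fewest agents on neighbouring vertices (Lemma 2, lower bound).
    p_worst = 1.0 - math.exp(-k / (2.0 * n))
    # Best case: most agents on neighbouring vertices (Lemma 2, upper bound).
    p_best = 1.0 - math.exp(-3.0 * k / (2.0 * n * (1.0 - 1.0 / math.log(n))))
    return numerator / p_best, numerator / p_worst


# k = 500 agents is a hypothetical value chosen only for illustration.
lower, upper = coverage_time_bounds(n=1000, k=500, rho=3, eps=0.001)
```

The gap between the two values comes entirely from the factor-of-three spread in vertex degrees allowed by the proof.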

Proof. Since our bounds are probabilistic, we can state that the following "bad events" occur with very low probability
(e.g. $2^{-\omega(n)}$). Event $E_{low\ degree}$, defined as the existence of a vertex $v \in V$ with $deg(v) < \frac{n \cdot p_N}{2}$. Event $E_{high\ degree}$,
defined as the existence of a vertex $v \in V$ with $deg(v) > \frac{3 n \cdot p_N}{2}$. Using the Chernoff bound on G we get:

$$Prob\left[deg(v) < \frac{n \cdot p_N}{2}\right] < e^{-\frac{n \cdot p_N}{8}}$$

Applying union bound on all vertices we get:

$$Prob[E_{low\ degree}] < n \cdot e^{-\frac{n \cdot p_N}{8}} < 2^{-\omega(n)}$$

Similarly:

$$Prob[E_{high\ degree}] < 2^{-\omega(n)}$$

From now on we assume that $E_{low\ degree}$ and $E_{high\ degree}$ do not occur, and condition all probabilities over this
assumption. In the special case of $\forall v \in V,\ deg(v) = p_N \cdot n$, every analysis that is based on the expected number of
neighbors shall hold.

In order to continue analyzing the execution process of the TPP algorithm we note that as the initial placement
of the agents is random, their movement is random and the graph G is random, the placement of the
agents after every step is purely random over the nodes. Using this observation, the number of agents residing in
vertices adjacent to some vertex v can be bounded:

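Lemma 1. Let $v \in V$ be an arbitrary vertex of G. Let $N_1(v,t)$ be the number of agents which reside on one of
Neighbor(v) (adjacent vertices to v) after step t:

$$\forall t \ge 0: \quad \frac{3 p_N \cdot k}{2} \ge E[N_1(v,t)] \ge \frac{p_N \cdot k}{2}$$

In other words, the expected number of agents who reside in distance 1 from v after every step is at least $\frac{p_N \cdot k}{2}$ and
at most $\frac{3 p_N \cdot k}{2}$.

Proof. Upon our assumption, in $G(n, p_N)$ the number of incoming neighbors for some vertex v is at least $\frac{1}{2} p_N \cdot n$ and
at most $\frac{3}{2} p_N \cdot n$. In addition, for every $u \in V(G)$, $Prob[\text{some agent resides on } u] = \frac{k}{n}$. Combining the two we see
that:

$$\forall t \ge 0: \quad E[N_1(v,t)] \ge \frac{p_N \cdot n}{2} \cdot \frac{k}{n} = \frac{1}{2} p_N \cdot k$$

$$\forall t \ge 0: \quad E[N_1(v,t)] \le \frac{3 p_N \cdot n}{2} \cdot \frac{k}{n} = \frac{3}{2} p_N \cdot k$$

□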
Lemma 2. For any vertex $v \in V$, the probability of v being notified at the next time-step that $a_i$ is malicious is at
least $1 - e^{-\frac{k}{2n}}$ and at most $1 - e^{-\frac{3k}{2n\left(1 - \frac{1}{\ln n}\right)}}$.

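Proof. The probability that an agent located on a vertex u such that $(u, v) \in E$ will move to v at the next time-step
is $\frac{1}{p_N \cdot n}$. The number of agents that are located in adjacent vertices to v is between $\frac{1}{2} p_N \cdot k$ and $\frac{3}{2} p_N \cdot k$. Therefore, the
probability that v will not be reported about $a_i$ at the next time-step is between $\left(1 - \frac{1}{p_N \cdot n}\right)^{\frac{3}{2} p_N \cdot k}$ and $\left(1 - \frac{1}{p_N \cdot n}\right)^{\frac{1}{2} p_N \cdot k}$.
Using the well known inequality $(1 - x) < e^{-x}$ for $x < 1$, we can bound this probability from above by:

$$\left(1 - \frac{1}{p_N \cdot n}\right)^{\frac{1}{2} p_N \cdot k} \le e^{-\frac{k}{2n}}$$

Using the inequality $(1 - x) > e^{-\frac{x}{1-x}}$ for $x < 1$, it will be bounded from below by:

$$\left(1 - \frac{1}{p_N \cdot n}\right)^{\frac{3}{2} p_N \cdot k} \ge e^{-\frac{(p_N \cdot n)^{-1}}{1 - (p_N \cdot n)^{-1}} \cdot \frac{3}{2} p_N \cdot k} = e^{-\frac{3k}{2n\left(1 - \frac{1}{p_N \cdot n}\right)}}$$

In order to guarantee that the graph is connected we know that $p_N > \frac{\ln n}{n}$ [19]. Therefore, the probability that v will
be notified on the next time-step is at least $1 - e^{-\frac{k}{2n}}$ and at most $1 - e^{-\frac{3k}{2n\left(1 - \frac{1}{\ln n}\right)}}$. □

Interestingly, this fact holds for any positive $p_N$ (the density parameter of G).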
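
The two exponential inequalities used in the proof of Lemma 2 can be checked numerically. The sketch below is only a sanity check under assumed values: it takes the per-agent move probability to be $1/(p_N \cdot n)$ and the number of neighbouring agents to lie between $p_N k / 2$ and $3 p_N k / 2$; the function name and the choice k = 500 are illustrative.

```python
import math


def not_notified_bounds(n, k, p_n):
    """Bounds on the probability that a vertex is NOT notified in one step.

    Returns (lower_closed, lower_exact, upper_exact, upper_closed)."""
    x = 1.0 / (p_n * n)                          # chance a neighbouring agent moves to v
    upper_exact = (1.0 - x) ** (0.5 * p_n * k)   # fewest neighbouring agents
    lower_exact = (1.0 - x) ** (1.5 * p_n * k)   # most neighbouring agents
    upper_closed = math.exp(-k / (2.0 * n))                      # via 1 - x < e^(-x)
    lower_closed = math.exp(-3.0 * k / (2.0 * n * (1.0 - x)))    # via 1 - x > e^(-x/(1-x))
    return lower_closed, lower_exact, upper_exact, upper_closed


# n and p_n follow the experimental setup; k = 500 is a hypothetical value.
lo_c, lo_e, up_e, up_c = not_notified_bounds(n=1000, k=500, p_n=0.01)
```

The closed-form expressions bracket the exact powers, so replacing the powers by exponentials only loosens the bounds and never reverses them.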

Lemma 2 states the probability that some vertex $v \in V$ will be reported of $a_i$ at the next time-step. This is in fact
a Bernoulli trial with probability $p_{success}$. Now we bound the probability of failing this trial (not notifying vertex v
enough times) after m steps. Let $X_v(m)$ denote the number of times that a notification message had arrived to v
after m steps, and let $F_v(m)$ denote the event that v was not notified enough times after m steps (i.e. $X_v(m) < \rho$).
We additionally denote by F(m) the event that one of the vertices of G was not notified enough times after m steps
(i.e. $\bigcup_{v \in V(G)} F_v(m)$). We use the Chernoff bound:

$$P[X_v(m) < (1 - \delta) p_{success} m] < e^{-\delta^2 \frac{m p_{success}}{2}}$$

in which we set $\delta = 1 - \frac{\rho}{m p_{success}}$. We can then see that:

$$P[X_v(m) < \rho] < e^{-\left(1 - \frac{\rho}{m p_{success}}\right)^2 \frac{m p_{success}}{2}}$$

namely: $P[F_v(m)] < e^{\rho - \frac{m p_{success}}{2}}$. Applying the union bound we get:

$$P[e_1 \cup e_2 \cup \ldots \cup e_n] \le P[e_1] + P[e_2] + \ldots + P[e_n]$$

on all n vertices of G. Therefore we can bound the probability of failure on any vertex v (using Lemma 2) as follows:

$$Pr[F(m)] \le n e^{\rho - \frac{m p_{success}}{2}} \le n e^{\rho - \frac{m\left(1 - e^{-\frac{k}{2n}}\right)}{2}} \le \epsilon$$

Assigning $p_{success} = 1 - e^{-\frac{k}{2n}}$ (worst case) and $p_{success} = 1 - e^{-\frac{3k}{2n\left(1 - \frac{1}{\ln n}\right)}}$ (best case), the rest is implied. □
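
To see how loose the per-vertex Chernoff step is in practice, the following sketch compares the exact binomial lower tail $P[X_v(m) < \rho]$ with the bound $e^{\rho - m p_{success}/2}$. The parameter values are arbitrary illustrations, and $\rho$ is assumed to be an integer threshold.

```python
import math


def binomial_lower_tail(m, p, rho):
    """Exact P[X < rho] for X ~ Binomial(m, p), with integer rho."""
    return sum(math.comb(m, j) * p**j * (1 - p)**(m - j) for j in range(rho))


def chernoff_bound(m, p, rho):
    """The bound e^(rho - m*p/2), obtained with delta = 1 - rho/(m*p)."""
    return math.exp(rho - m * p / 2.0)


# Illustrative values only: 200 steps, per-step success 0.2, threshold 3.
m, p, rho = 200, 0.2, 3
exact = binomial_lower_tail(m, p, rho)
bound = chernoff_bound(m, p, rho)
```

For these values the exact tail is many orders of magnitude below the bound, which suggests the completion-time estimate derived from it is conservative.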



Appendix B: Leeching and Muting Attacks 

In this section we give the complete proofs for the Theorems and Corollaries that appear in Section VII of the main 
body of the paper. 

Detailed Analysis - Completion Time. Recall we denote by $p_{mute}$ the probability that a given network device
may decide to stop generating vaccination messages and block some or all of the messages that are received by it. In
addition, recall that $T(n, p_{mute})$ denotes the vaccination time of a network of n devices, with a probability of $p_{mute}$
to block messages. Theorem 2 can now be revised in the following way:

THEOREM 7: The vaccination completion time of the TPP algorithm for some critical penetration $p_{MAX}$ in
probability greater than $1 - \epsilon$, while at most $n \cdot p_{mute}$ devices may block message forwarding and generation, is:

$$T(n, p_{mute}) = \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - \exp\left(-\frac{1 - p_{mute} - e^{-timeout \cdot p_{mute}}}{p_{mute}} \cdot \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N \cdot (1 - E_-)^{-1}}\right)}$$

while for the calculation of timeout we can use the expressions that appear in Theorem 2 or Theorem 3.
Proof. The number of new messages created at time t, k(t), is expected to be:

$$k(t) = (1 - p_{mute}) \cdot \frac{n^2 \cdot p_{a_t} \cdot p_N}{T \cdot N}(1 - E_-)$$

Given a message created with a TTL of timeout, at every time step it has a probability of $p_{mute}$ to be sent to a
network node which will not forward it onwards. Therefore, given a group of m messages created at time t, at
time t + i (for every i < timeout), only $(1 - p_{mute})^i \cdot m$ messages would remain alive. The average number of messages
at any time step between t and t + timeout is therefore:

$$m \cdot \frac{1 - (1 - p_{mute})^{timeout}}{timeout \cdot p_{mute}}$$

As we assume that $\forall t < T_{vaccination}:\ p_{a_t} > p_{MAX}$, the number of agents k would be at least:

$$k \ge \frac{1 - p_{mute} - (1 - p_{mute})^{timeout+1}}{p_{mute}} \cdot \frac{n^2 \cdot p_{MAX} \cdot p_N}{T \cdot N}(1 - E_-)$$

Using again the fact that $\forall x < 1:\ (1 - x) < e^{-x}$, we can see that:

$$k \ge \frac{1 - p_{mute} - e^{-timeout \cdot p_{mute}}}{p_{mute}} \cdot \frac{n^2 \cdot p_{MAX} \cdot p_N}{T \cdot N}(1 - E_-)$$

Recalling Theorem 1:

$$T(n, 0) = \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - e^{-\frac{k}{2n}}}$$

and assigning the revised expression for k, the rest is implied. □
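
The effect of muting devices on the completion time can be illustrated with a short sketch. It checks the closed form for the average number of surviving messages and then feeds a reduced agent count into the Theorem 1 upper bound. This is a simplification made for illustration: the agent count is scaled only by the surviving fraction, the extra $(1 - p_{mute})$ generation factor is ignored, and all numeric values and function names are hypothetical.

```python
import math


def surviving_fraction(p_mute, timeout):
    """Average fraction of a message batch still alive over `timeout` steps,
    when each hop is dropped with probability p_mute."""
    alive = [(1.0 - p_mute) ** i for i in range(timeout)]
    return sum(alive) / timeout


def vaccination_time(n, rho, eps, k):
    """Theorem 1 upper bound, with k the number of live agents."""
    return 2.0 * (rho - math.log(eps / n)) / (1.0 - math.exp(-k / (2.0 * n)))


p_mute, timeout = 0.05, 40            # illustrative values
closed_form = (1.0 - (1.0 - p_mute) ** timeout) / (timeout * p_mute)
t_clean = vaccination_time(n=1000, rho=3, eps=0.001, k=500)
t_muted = vaccination_time(n=1000, rho=3, eps=0.001,
                           k=500 * surviving_fraction(p_mute, timeout))
```

Fewer live agents lower the per-step notification probability, so `t_muted` exceeds `t_clean`, in line with the direction of Theorem 7.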

COROLLARY 4 : When the number of blocking devices is greater than c • O (-j^tn) ^ or some c >• 1), the 
completion time of the TPP algorithm is affected as follows : 

T(n,p mute )<-^^-T(n,0f 

Pmute 

Proof. When r mu te ^> 1 Theorem 7 converges as follows : 

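Observation 4. When $p_{mute} \gg timeout^{-1}$, the vaccination time of the TPP algorithm is:

$$T(n, p_{mute}) = \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - \exp\left(-\frac{1 - p_{mute}}{p_{mute}} \cdot \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N \cdot (1 - E_-)^{-1}}\right)}$$

Assuming again that $\epsilon = n^{-\alpha}$ s.t. $\alpha \in Z^+$, then provided that:

$$\frac{1 - p_{mute}}{p_{mute}} \cdot \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N}(1 - E_-) < 1$$

we see that:

$$T(n, p_{mute}) \le \frac{4N \cdot T \cdot p_{mute}\left(\rho + (\alpha + 1)\ln n\right)}{(1 - p_{mute}) \cdot n \cdot p_{MAX} \cdot p_N (1 - E_-)}$$

Using Theorem 3, we shall calculate $\Delta_{p_{mute}}$, denoting the increase in the vaccination time as a result of the presence
of the blocking nodes:

$$\Delta_{p_{mute}} = \frac{T(n, p_{mute})}{T(n, 0)} \le \frac{\frac{4N \cdot T \cdot p_{mute}\left(\rho + (\alpha + 1)\ln n\right)}{(1 - p_{mute}) \cdot n \cdot p_{MAX} \cdot p_N (1 - E_-)}}{4\sqrt{\frac{T \cdot N\left(\rho + (\alpha + 1)\ln n\right)}{n \cdot p_{MAX} \cdot p_N \cdot (1 - E_-)}}}$$

and after some arithmetic we can see that:

$$\Delta_{p_{mute}} = \frac{T(n, p_{mute})}{T(n, 0)} \le \frac{p_{mute}}{1 - p_{mute}} \cdot T(n, 0)$$

□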
Detailed Analysis - Cost. Recall we denote by $M(n, p_{mute})$ the overall cost of the TPP algorithm (messages sent
+ monitoring) for a network of n devices, with a probability of $p_{mute}$ to block messages. As shown in the following
Corollary, the overall cost of the TPP algorithm remains unaffected by the presence of any given number of blocking
devices.

COROLLARY 5 : The overall cost of the TPP algorithm is unaffected by the presence of blocking devices. 
Namely : 

$$\forall p_{mute} < 1, \quad M(n, p_{mute}) = M(n, 0)$$

Proof. We have already shown before (Corollary 3) that when the number of blocking devices is smaller than c-Q( pl " - ) 
(for some c<l), the system is approximately unaffected by the blocking devices. Therefore, we can expect that : 

$$M(n, p_{mute}) \approx M(n, 0)$$

Interestingly, this is also the case when the number of blocking devices is far greater. Recalling Observation 1, the
cost of the algorithm with no blocking devices equals:

$$M(n, 0) = O\left(k \cdot T(n, 0) + \frac{k}{n \cdot p_N} \cdot C_M\right)$$

Denoting by $k_{(n, p_{mute})}$ the expected number of active agents at each time step, we have already shown in the proof
of Theorem 7 that for large values of $p_{mute}$:

$$k_{(n, p_{mute})} \approx \frac{1 - p_{mute}}{p_{mute}} \cdot T(n, 0)^{-1} \cdot k_{(n, 0)}$$

The overall cost of the algorithm assuming a large number of blocking devices equals:

$$M(n, p_{mute}) = O\left(k_{(n, p_{mute})} \cdot T(n, p_{mute}) + \frac{k_{(n, p_{mute})}}{n \cdot p_N} C_M\right)$$

Assigning the values of $k_{(n, p_{mute})}$ and $T(n, p_{mute})$ we can now see that:

$$M(n, p_{mute}) = O\left(k_{(n, 0)} \cdot T(n, 0) + \frac{1 - p_{mute}}{p_{mute}} \cdot \frac{k_{(n, 0)}}{T(n, 0) \cdot n \cdot p_N} C_M\right)$$

As we assumed that p m ute ^> we know that : 

1 ~ Pmute 1 p In n , -. 

< Y -C p In n - 1 

The expression for the overall cost of the algorithm can now be rewritten as : 

M(n, P mute) - O ( fc(„, 0) • T(n, 0) + -^-^C M ) 
V T(n,0) -n-p N J 

and as $T(n, 0) = O(\ln n)$, we get the requested result of:

$$M(n, p_{mute}) = O\left(k_{(n, 0)} \cdot T(n, 0) + \frac{k_{(n, 0)}}{n \cdot p_N} C_M\right) = M(n, 0)$$

We have shown that $M(n, p_{mute}) = M(n, 0)$ for $p_{mute} \ll timeout^{-1}$ as well as for $p_{mute} \gg timeout^{-1}$. We shall
now upper bound the overall cost of the algorithm for values of $p_{mute}$ which are close to $timeout^{-1}$. For this, we shall
find the value of $p_{mute}$ which maximizes $M(n, p_{mute})$:

$$\frac{\partial M(n,p)}{\partial p} = \frac{\partial k_{(n,p)}}{\partial p} T(n,p) + \frac{\partial T(n,p)}{\partial p} k_{(n,p)} + \frac{\partial k_{(n,p)}}{\partial p} \cdot \frac{C_M}{n \cdot p_N}$$
Recalling (from the proof of Theorem 7) that:

$$k \ge \frac{1 - p_{mute} - e^{-timeout \cdot p_{mute}}}{p_{mute}} \cdot \frac{n^2 \cdot p_{MAX} \cdot p_N}{T \cdot N}(1 - E_-)$$

we see that:

$$\frac{\partial k_{(n,p)}}{\partial p} = \frac{(p \cdot timeout + 1) \cdot e^{-timeout \cdot p} - 1}{p^2} \cdot a$$

where:

$$a = \frac{n^2 \cdot p_{MAX} \cdot p_N}{T \cdot N}(1 - E_-)$$

Therefore:

$$\forall p > 0, \quad \frac{\partial k_{(n,p)}}{\partial p} < 0$$

and the number of agents is monotonically decreasing.
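
The monotonicity claim can be verified numerically by comparing a closed-form derivative with a central finite difference. The sketch assumes the lower bound on the number of agents has the form $a \cdot (1 - p - e^{-timeout \cdot p}) / p$, with the constant factor normalised to a = 1; the function names and sample points are illustrative.

```python
import math


def agents(p, timeout, a=1.0):
    """Assumed lower bound on the number of agents as a function of p = p_mute."""
    return a * (1.0 - p - math.exp(-timeout * p)) / p


def agents_derivative(p, timeout, a=1.0):
    """Closed-form derivative of `agents` with respect to p."""
    return a * ((p * timeout + 1.0) * math.exp(-timeout * p) - 1.0) / p**2


timeout, h = 40, 1e-6
checks = []
for p in (0.01, 0.1, 0.5, 0.9):
    # Central finite difference as an independent estimate of the derivative.
    numeric = (agents(p + h, timeout) - agents(p - h, timeout)) / (2 * h)
    checks.append((p, numeric, agents_derivative(p, timeout)))
```

The derivative is negative at every sampled point because $(1 + x) e^{-x} < 1$ for any $x > 0$, here with $x = p \cdot timeout$.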

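Examining the behavior of the cleaning time, we can see that:

$$\frac{\partial T(n,p)}{\partial p} = -\frac{\partial k_{(n,p)}}{\partial p} \cdot \frac{\left(\rho - \ln\frac{\epsilon}{n}\right) \cdot e^{-\frac{k_{(n,p)}}{2n}}}{n\left(1 - e^{-\frac{k_{(n,p)}}{2n}}\right)^2}$$

where

$$\frac{k_{(n,p)}}{2n} = \frac{1 - p - e^{-timeout \cdot p}}{p} \cdot \beta, \qquad \beta = \frac{n \cdot p_{MAX} \cdot p_N}{2T \cdot N \cdot (1 - E_-)^{-1}}$$

Using our observation concerning $\frac{\partial k_{(n,p)}}{\partial p}$, we see that:

$$\forall p > 0, \quad \frac{\partial T(n,p)}{\partial p} > 0$$

and the cleaning time is monotonically increasing.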

Returning to the derivative of the overall price function, let us divide it into two components, the first
representing the number of messages sent during the algorithm and the second representing the monitoring activities
of the devices:

$$\frac{\partial M(n,p)}{\partial p} = M_1(p) + M_2(p)$$

where:

$$M_1(p) = \frac{\partial k_{(n,p)}}{\partial p} T(n,p) + \frac{\partial T(n,p)}{\partial p} k_{(n,p)}$$

and

$$M_2(p) = \frac{\partial k_{(n,p)}}{\partial p} \cdot \frac{C_M}{n \cdot p_N}$$

As $k_{(n,p)}$ is monotonically decreasing, the cost of the algorithm due to monitoring activities, represented by $M_2(p)$,
would be maximized when $p_{mute} = 0$ (for which we have already shown that M(n,p) = M(n,0)).
As to Mi(p), it can be written as : 



Mi(p) = 



dk 



(n,p) 



where 



dp 



T{n,p)- 



(1 - e -P- x f) 



2 ""P 



x p ■ f3 



n ■ pmax • Pn 
2T- N ■ (l-E-)- 1 



26 



and : 

2 p g—timeout-p 

p 

Studying this function reveals that it has a single minimum point, while $M_1(0) = 0$ and $M_1(1) \to \infty$. This means
that the maximal values of $M(n, p_{mute})$ are received either at $p_{mute} = 0$ or at $p_{mute} = 1$. As we have already shown
that at these points the overall cost of the algorithm remains unchanged, we can conclude that the overall cost of the
algorithm at any point between these values of $p_{mute}$ is also bounded by M(n, 0). □



Appendix C: $\lambda$ Optimization

Recall that while analyzing the performance of the TPP algorithm we artificially divided the vaccination process
into two phases. In the first phase, the vaccinating agents are generated, while the proliferation of the agents is done
in the second phase. The ratio between the two phases was assumed to be 1, namely, we assumed that the two
phases are identical in length. Following is the analysis of the selection of this ratio, which demonstrates that this is
indeed the optimal division.

Let us recall that:

$$timeout = \lambda \cdot (T_{Generation} + timeout)$$

and that:

$$T_{Generation} = \frac{1 - \lambda}{\lambda} \cdot timeout$$

As we assumed in the previous section that $\lambda = 0.5$, it is interesting to know whether another value of $\lambda$ (perhaps
a value that depends on the properties of the network) may yield a lower completion time.
Concerning $T_{Vac}$, it can now be seen that:

$$T_{Vac} \le \frac{1}{\lambda} \cdot timeout$$

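Similar to the calculation of the number of agents k in Theorem 2, we know that:

$$k = \sum_{i \le T_{Generation}} \frac{n^2 \cdot p_{a_i} \cdot p_{Neighbor}}{T \cdot N}(1 - E_-) \ge \frac{1 - \lambda}{\lambda} \cdot timeout \cdot \frac{n^2 \cdot p_{MAX} \cdot p_{Neighbor}}{T \cdot N}(1 - E_-)$$

By selecting timeout = m we verify that the information propagation process will be completed successfully, and
so we can now write:

$$timeout = \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - e^{-\frac{k}{2n}}} \le \frac{2\left(\rho - \ln\frac{\epsilon}{n}\right)}{1 - \exp\left(-\frac{1 - \lambda}{\lambda} \cdot timeout \cdot \frac{n \cdot p_{MAX} \cdot p_{Neighbor}}{2T \cdot N}(1 - E_-)\right)}$$

Using a similar approximation as in the previous section, we shall assume that the requested $\epsilon$ is polynomial in $\frac{1}{n}$.
We can therefore write:

$$2\left(\rho + (\alpha + 1)\ln(n)\right) \ge timeout \cdot \left(1 - \exp\left(-\frac{1 - \lambda}{\lambda} \cdot timeout \cdot \frac{n \cdot p_{MAX} \cdot p_{Neighbor}}{2T \cdot N}(1 - E_-)\right)\right)$$

Using the bound $(1 - x) < e^{-x}$ for $x < 1$ we can see that when assuming that:

$$\frac{1 - \lambda}{\lambda} \cdot timeout \cdot \frac{n \cdot p_{MAX} \cdot p_{Neighbor}}{2T \cdot N}(1 - E_-) < 1$$

we can write the previous expression as:

$$2\rho + (2\alpha + 2)\ln(n) \ge \frac{1 - \lambda}{\lambda} \cdot timeout^2 \cdot \frac{n \cdot p_{MAX} \cdot p_{Neighbor}}{2T \cdot N}(1 - E_-)$$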
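and therefore:

$$timeout \le \sqrt{\frac{\lambda}{1 - \lambda} \cdot \frac{4T \cdot N\left(\rho + (\alpha + 1)\ln(n)\right)}{n \cdot p_{MAX} \cdot p_{Neighbor} \cdot (1 - E_-)}}$$

which means that:

$$T_{Vac} \le \frac{1}{\sqrt{\lambda - \lambda^2}} \sqrt{\frac{4T \cdot N\left(\rho + (\alpha + 1)\ln(n)\right)}{n \cdot p_{MAX} \cdot p_{Neighbor} \cdot (1 - E_-)}}$$

In order to find the optimal value of $\lambda$ we now calculate $\frac{\partial T_{Vac}}{\partial \lambda}$ for this upper bound:

$$\frac{\partial T_{Vac}}{\partial \lambda} = -\frac{1 - 2\lambda}{2(\lambda - \lambda^2)^{1.5}} \sqrt{\frac{4T \cdot N\left(\rho + (\alpha + 1)\ln(n)\right)}{n \cdot p_{MAX} \cdot p_{Neighbor} \cdot (1 - E_-)}}$$

It can now be easily seen that:

$$\lambda = \frac{1}{2}: \quad \frac{\partial T_{Vac}}{\partial \lambda} = 0, \qquad \frac{\partial^2 T_{Vac}}{\partial \lambda^2} > 0$$

and therefore the value of $\lambda = 0.5$ minimizes the upper bound over the algorithm's completion time.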
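
The optimality of an even split between the two phases can be confirmed with a simple grid search. The sketch assumes that the upper bound on the completion time depends on the phase ratio (called `lam` here) only through the factor $1/\sqrt{lam \cdot (1 - lam)}$, with the remaining network-dependent terms treated as a constant; the function name is illustrative.

```python
def completion_factor(lam):
    """Ratio-dependent factor 1/sqrt(lam * (1 - lam)) of the assumed bound."""
    return (lam * (1.0 - lam)) ** -0.5


# Search the open interval (0, 1) on a grid of step 0.001.
grid = [i / 1000.0 for i in range(1, 1000)]
best = min(grid, key=completion_factor)
```

The factor grows without bound as the ratio approaches 0 or 1, so any strongly unbalanced split between agent generation and propagation lengthens the bound.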